feat(isthmus): add PostgreSQL TPC-H integration testing #700
nielspardon wants to merge 3 commits into substrait-io:main
I am working on a fuller response, but first wanted to check: are you aware of this seemingly related repository?
I am aware of the repo. The problem with the consumer-testing repo is that it is centralized by design and hence difficult to maintain. It is currently not actively maintained. I would suggest not doing the consumer testing centrally, but instead providing a Substrait testkit that integrations can test against in a decentralized setup. This brings the testing closer to the code being tested and allows for shorter cycles. This is, for example, what a project like dbt does with its many adapters. @andrew-coleman has been looking into updating the consumer-testing repo to a more recent spec version, and it led him down a rabbit hole: many of the currently tested consumers have not been updated to the latest spec version, and some of them, e.g. Ibis, appear to be unmaintained.
Have these tests caught any broken behavior? I am apprehensive about introducing 90,000 lines of code in this PR. Is there some way to fetch these test files dynamically from their source, e.g. an existing Java package that wraps these static files, or a URL from which they can be cloned? If I understand correctly, these files are intended never to be modified.
Yes, see above.
Most of it is not code but test data.
There is no existing source to fetch them from. There is no official TPC dataset, but TPC provides a data generator with which one can generate the test data. What we could do is move the test data into the consumer-testing repo and turn that repo into a testkit repo, rather than a centralized repository that performs the testing itself.
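For readers unfamiliar with the generated test data: the TPC-H data generator writes pipe-delimited `*.tbl` files in which every row also ends with a trailing `|`. The following is only a minimal sketch of reading one such row, not the code used in this PR. The class name `TblRows` and the sample row are hypothetical and chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of isthmus or this PR.
final class TblRows {
    private TblRows() {}

    /** Splits one generator-style row such as "1|foo|bar|" into its column values. */
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == '|') {
                // Everything since the previous delimiter is one column (possibly empty).
                fields.add(line.substring(start, i));
                start = i + 1;
            }
        }
        // Rows are expected to end with '|', so leftover text means a malformed row.
        if (start != line.length()) {
            throw new IllegalArgumentException("row is not terminated by '|': " + line);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Sample row in the shape of a region table entry (values are illustrative).
        System.out.println(parseLine("0|AFRICA|some comment|"));
    }
}
```

Scanning for the delimiter by hand, instead of calling `String.split`, keeps empty columns and makes the trailing-delimiter check explicit.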
a9b79fe to aa156dd
Signed-off-by: Niels Pardon <[email protected]>
aa156dd to 14cf3b5
This PR adds integration testing for isthmus against a PostgreSQL database.
… `*.tbl` files … `*.tbl` files … `queries` subfolder, e.g. `interval '1 day'` instead of `interval '1' day(3)`.

This is currently a draft since I want to get some early feedback. What I'm still planning to do: