ParquetSerializer - A Parquet serializer for PxWeb #23

Merged: 10 commits, Oct 30, 2023

@trygu (Contributor) commented Jul 6, 2023

  • Adds Apache Parquet support for PxWeb (https://parquet.apache.org).
  • Added integration tests for ParquetSerializer that use DynamicData to run through all the test files.
  • Added more test files.
  • Parquet serialization uses three data types: DateTime for TIMEVALs, double for data, and string for other variables.
  • Keeps the original TimeValue in a separate column in addition to a parsed DateTime column.
  • Handles NPM in a separate column for each content variable and replaces the corresponding values in the value column with double.NaN.
  • Slight fix of the ConsoleTestApp (might be a good idea to rewrite it as a proper console app?).
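
To make the column layout described above concrete, here is a minimal sketch in plain Python. It is not the ParquetSerializer code from this PR. The function names, the column names (`time_parsed`, the `_symbol` suffix), the time-code shapes (`2023`, `2023M07`, `2023Q3`), and the reading of NPM as a non-numeric cell marker such as `..` are all assumptions made for illustration.

```python
import math
from datetime import datetime


def parse_time_value(raw):
    """Parse an assumed PX-style time code into a datetime (start of period).

    Only three shapes are handled here; anything else yields None.
    """
    try:
        if len(raw) == 4:                        # e.g. "2023"
            return datetime(int(raw), 1, 1)
        if len(raw) == 7 and raw[4] == "M":      # e.g. "2023M07"
            return datetime(int(raw[:4]), int(raw[5:]), 1)
        if len(raw) == 6 and raw[4] == "Q":      # e.g. "2023Q3"
            return datetime(int(raw[:4]), 3 * (int(raw[5]) - 1) + 1, 1)
    except ValueError:
        pass
    return None  # unknown format: leave the parsed column empty


def to_row(region, time_raw, content_name, cell):
    """Build one flat row; `cell` is a number or a non-numeric marker string."""
    is_number = isinstance(cell, (int, float))
    return {
        "region": region,                            # string column (other variables)
        "time": time_raw,                            # original time value, kept as-is
        "time_parsed": parse_time_value(time_raw),   # parsed datetime column
        content_name: float(cell) if is_number else math.nan,    # double column
        content_name + "_symbol": None if is_number else cell,   # hypothetical marker column
    }


if __name__ == "__main__":
    print(to_row("Oslo", "2023M07", "population", 709037))
    print(to_row("Oslo", "2023Q3", "population", ".."))
```

Keeping the marker in its own column lets the value column stay a pure double column, so readers can aggregate it directly and consult the marker column only when a NaN needs explaining.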

@trygu force-pushed the main branch 2 times, most recently from 4641c0a to 1473e5c, on July 6, 2023 19:45
@trygu (Contributor, Author) commented Aug 15, 2023

Ok. Now I promise it's complete.

@trygu requested a review from likp on September 26, 2023 08:24
@trygu changed the title from "ParquetSerializer - Simple, first attempt at a Parquet serializer for PxWeb" to "ParquetSerializer - A Parquet serializer for PxWeb" on Oct 30, 2023
@runejo merged commit f59a980 into PxTools:main on Oct 30, 2023
4 checks passed

3 participants