There is a known compatibility issue with timestamp columns in Parquet files written by pyarrow: timestamps with nanosecond precision cannot be read correctly by some software systems or versions. Depending on the pyarrow version and the write options, nanosecond values are stored either with the nanosecond unit of the Parquet TIMESTAMP logical type, which older readers do not recognize, or in the legacy INT96 representation, which the Parquet specification has deprecated. A reader that does not support the representation used may fail to open the file, lose precision, or return wrong values. To avoid this, either store the timestamps with microsecond precision, or convert the timestamp values yourself (for example, cast the column to microseconds) before writing them to the Parquet file.
Asked: 2022-01-25 11:00:00 +0000
Seen: 15 times
Last updated: Oct 09 '21