There are several possible reasons why Trino may return different rows than Spark for the same Parquet tables:

  1. Data inconsistency: The two engines may not be reading the same data. If they point at different files, partitions, or table metadata, or if values or formatting differ in certain columns, the results of the same SQL query will differ.

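  2. Query optimization: Trino and Spark plan and execute queries differently (Trino, for example, uses cost-based optimization). This should not change the result of a deterministic query, but it can change the order in which rows come back. A query without ORDER BY, or one that uses LIMIT without ORDER BY, may therefore display different rows in each engine.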

  3. Version differences: Different releases of Trino and Spark can handle Parquet tables differently, for example in which Parquet features and data types their readers support, which may lead to differing results.

  4. Configuration differences: Trino and Spark may be configured differently, which affects how they read and display data. For example, the two engines may differ in how Parquet columns are mapped to table columns (by name or by position) or in how timestamps and time zones are interpreted.
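
Formatting differences like those in points 1 and 4 can make identical data look different. As a minimal sketch, assuming the rows from each engine have already been fetched into Python as sequences of values, the hypothetical helpers below (`normalize_value` and `normalize_rows` are illustrative names, not part of Trino or Spark) map each cell to a canonical form before comparing. The rounding precision, the whitespace stripping, and the choice to treat naive timestamps as UTC are assumptions to adapt to your data.

```python
from datetime import datetime, timezone
from decimal import Decimal


def normalize_value(value, float_digits=6):
    """Map one cell to a canonical form so engine-specific formatting is ignored."""
    if value is None or isinstance(value, bool):
        return value
    if isinstance(value, (float, Decimal)):
        # Assumption: differences beyond `float_digits` decimals are noise.
        return round(float(value), float_digits)
    if isinstance(value, datetime):
        # Assumption: timestamps without a zone are meant as UTC.
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)
        return value.astimezone(timezone.utc).isoformat()
    if isinstance(value, str):
        # Assumption: leading and trailing whitespace is not significant.
        return value.strip()
    return value


def normalize_rows(rows):
    """Normalize every cell and sort, so that row order does not matter."""
    normalized = [tuple(normalize_value(v) for v in row) for row in rows]
    # Sorting by repr keeps the sort well-defined when columns contain None.
    return sorted(normalized, key=repr)
```

If `normalize_rows(trino_rows) == normalize_rows(spark_rows)` holds, the engines agree on the data and the visible difference is only ordering or formatting.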

To resolve this issue, first identify which of these causes applies, then adjust the configuration or the query itself (for example, by adding an explicit ORDER BY) until both engines return consistent results.
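
One way to start the investigation is to find exactly which rows differ. The following is a minimal sketch, assuming the same query has been run in both engines and the results exported as lists of rows. `diff_result_sets` is a hypothetical helper name. It compares the results as multisets, so row order is ignored but duplicate rows still count.

```python
from collections import Counter


def diff_result_sets(trino_rows, spark_rows):
    """Return (rows only in Trino, rows only in Spark) as Counters.

    Each Counter maps a row tuple to how many extra times it appears
    on that side, so duplicated rows are reported as well.
    """
    trino_counts = Counter(map(tuple, trino_rows))
    spark_counts = Counter(map(tuple, spark_rows))
    # Counter subtraction keeps only keys with a positive remaining count.
    only_in_trino = trino_counts - spark_counts
    only_in_spark = spark_counts - trino_counts
    return only_in_trino, only_in_spark


if __name__ == "__main__":
    # Made-up sample rows for illustration only.
    trino = [(1, "a"), (2, "b"), (2, "b"), (3, "c")]
    spark = [(2, "b"), (1, "a"), (4, "d")]
    only_trino, only_spark = diff_result_sets(trino, spark)
    print("only in Trino:", dict(only_trino))
    print("only in Spark:", dict(only_spark))
```

If both Counters come back empty, the two engines agree on the data and the difference is only in row order. Otherwise, the listed rows point to the partitions, columns, or values worth inspecting.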