There are several possible reasons why the Spark 3.2 driver shows garbage collection activity while reading a JSON file when the Spark 2.3 driver did not:
Changes in the underlying code: Spark 3.2 may handle JSON parsing, data serialization and deserialization, or memory management differently from Spark 2.3. Any of these changes can create more short-lived objects on the driver and therefore more garbage collection.
Differences in the configuration: Spark 3.2 and Spark 2.3 may have different default settings for memory allocation, garbage collection, or other parameters that affect how much garbage collection occurs while reading a JSON file. The Spark 3.2 driver may also simply be configured differently from the Spark 2.3 driver. The JVM matters here too: if the Spark 3.2 driver runs on Java 11 while the Spark 2.3 driver ran on Java 8, the default collector changes from Parallel GC to G1.
Size of the JSON file: A large file amplifies any per-record or per-object overhead, so a difference in how Spark handles memory or sizes its serialized objects may go unnoticed on small inputs and only show up as extra garbage collection in Spark 3.2 on large ones.
Dependencies used: Spark 3.2 ships with different versions of its dependencies than Spark 2.3 (for example the Scala runtime and the Jackson JSON library). These can change allocation patterns and so result in more garbage collection in Spark 3.2.
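One way to narrow down the configuration point is to dump the effective settings from each driver and compare them. The snippet below is only a rough sketch in plain Python: it assumes you have already collected the key/value pairs from both clusters (for instance from `SparkConf.getAll()` or the Environment tab of the Spark UI), and the sample values are made up for illustration, not taken from the question.

```python
from typing import Dict, Iterable, Optional, Tuple

ConfPairs = Iterable[Tuple[str, str]]


def diff_conf(old: ConfPairs, new: ConfPairs) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {key: (old_value, new_value)} for every setting that differs.

    A value of None means the key is not set on that side.
    """
    old_map, new_map = dict(old), dict(new)
    changed = {}
    for key in sorted(set(old_map) | set(new_map)):
        before, after = old_map.get(key), new_map.get(key)
        if before != after:
            changed[key] = (before, after)
    return changed


# Illustrative values only (assumptions, not real dumps from either cluster).
conf_23 = [
    ("spark.driver.memory", "4g"),
    ("spark.sql.adaptive.enabled", "false"),
]
conf_32 = [
    ("spark.driver.memory", "4g"),
    ("spark.sql.adaptive.enabled", "true"),
    ("spark.driver.extraJavaOptions", "-XX:+UseG1GC"),
]

for key, (before, after) in diff_conf(conf_23, conf_32).items():
    print(f"{key}: {before} -> {after}")
```

Settings that appear only on one side, or with different values, are the first candidates to align before comparing garbage collection behaviour again.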
In general, it is difficult to pinpoint the exact reason for the difference in garbage collection between Spark 3.2 and Spark 2.3 without more information about the specific use case and configurations.
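To gather that missing information, it may help to run the same job on both versions with GC logging enabled on the driver. The following is a minimal sketch that only builds the `spark-submit` command line; the script name `read_json.py` and the `4g` driver memory are placeholders, and the JVM flags assume Java 8 for the older style and Java 9 or later for unified logging.

```python
import shlex
from typing import List


def gc_logging_flags(java_major: int) -> str:
    """Pick JVM GC logging flags for the given Java major version."""
    if java_major >= 9:
        # Unified JVM logging, available from Java 9 onwards.
        return "-Xlog:gc*"
    # Java 8 style GC logging flags.
    return "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"


def spark_submit_args(app: str, java_major: int, driver_memory: str = "4g") -> List[str]:
    """Build a spark-submit argument list with driver GC logging turned on."""
    return [
        "spark-submit",
        "--driver-memory", driver_memory,
        "--driver-java-options", gc_logging_flags(java_major),
        app,  # hypothetical application script
    ]


# Print a shell-safe command for each setup so the two runs can be compared.
print(shlex.join(spark_submit_args("read_json.py", java_major=8)))
print(shlex.join(spark_submit_args("read_json.py", java_major=11)))
```

Comparing the two driver GC logs for the same input file should show whether the difference lies in collection frequency, pause time, or heap size.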
Asked: 2022-11-25 11:00:00 +0000
Last updated: Jul 11 '21