One option would be to decrease the row group sizes until the total allocation falls within 95% of the driver memory. This can be done by adjusting the parquet.block.size and spark.sql.parquet.row.group.size parameters, as sketched below.
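Here is a minimal sketch of lowering the target row group size when rewriting a dataset with PySpark. The 32 MiB value and the /data/... paths are illustrative assumptions; parquet.block.size is the standard Parquet writer setting for row group size, set here on the Hadoop configuration so it applies to Parquet writes in this session:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("row-group-tuning").getOrCreate()

# parquet.block.size is the Parquet writer's target row group size in bytes.
# Setting it on the Hadoop configuration applies it to subsequent Parquet writes.
spark.sparkContext._jsc.hadoopConfiguration().set(
    "parquet.block.size", str(32 * 1024 * 1024)  # 32 MiB instead of the usual 128 MiB default
)

# Row groups only shrink when the data is rewritten; existing files keep their layout.
df = spark.read.parquet("/data/events")                   # hypothetical input path
df.write.mode("overwrite").parquet("/data/events_rg32")  # hypothetical output path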
Another option would be to increase the driver memory to accommodate the total allocation (see the sketch below). However, this may not be feasible if there are other constraints or limitations in the system.
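A sketch of raising the driver memory, assuming you control how the session is created; the 8g value is illustrative. Note that spark.driver.memory must be set before the driver JVM starts, so configuring it in code only takes effect when this process launches the JVM (e.g. local mode); when using spark-submit, pass --driver-memory on the command line instead:

from pyspark.sql import SparkSession

# spark.driver.memory is read at JVM startup; setting it on an already-running
# session has no effect. For spark-submit, use: --driver-memory 8g
spark = (
    SparkSession.builder
    .appName("bigger-driver")
    .config("spark.driver.memory", "8g")  # illustrative value
    .getOrCreate()
)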
It's important to note that row group sizes should generally be chosen based on the data characteristics and workload requirements, rather than solely based on memory constraints.