
Why does writing a Spark DataFrame fail with an error about a nested NullType in the column 'colname', which is of type ArrayType?

asked 2022-08-13 11:00:00 +0000

devzero


1 Answer


answered 2022-04-02 15:00:00 +0000

bukephalos

The error occurs because Spark assigns the internal type NullType (displayed as `void` in schemas) to a column or array element whose values are all null, for example one built with `lit(None)` or an empty `array()`. The Parquet format has no data type that corresponds to NullType, so when Spark maps the DataFrame schema to a Parquet schema the writer rejects any column whose type contains NullType, including one nested inside an ArrayType such as `array<void>`. The fix is to cast the column to a concrete type (for example `array<string>`) before writing, or to drop the column entirely if it carries no information.



