
The error about a nested NullType in the column 'colname' of type ArrayType occurs because the Parquet format has no equivalent of Spark's NullType (also shown as "void"). Parquet can store null values inside a column of any concrete type, but it cannot represent a column, or an array element, whose declared data type is itself null. Spark infers NullType when it has nothing to infer a real type from, for example when an array column contains only null literals or is built from an empty array() expression, so the schema ends up as array&lt;null&gt; and the Parquet writer rejects it. The usual fix is to cast the offending column to a concrete element type (for example array&lt;string&gt;) before writing.
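
Below is a minimal PySpark sketch of how the problem can arise and one way to work around it. The column name "colname" and the output path "/tmp/out" are placeholders, and the cast to array&lt;string&gt; is just an illustrative choice; pick whatever element type your data actually needs.

    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.types import ArrayType, StringType

    spark = SparkSession.builder.getOrCreate()

    # An array built only from null literals is inferred as array<null>.
    df = spark.range(1).select(F.array(F.lit(None)).alias("colname"))
    df.printSchema()  # colname: array, element type null/void

    # Writing this schema to Parquet fails, because Parquet has no null type:
    # df.write.parquet("/tmp/out")

    # Workaround: cast the column to a concrete element type before writing.
    fixed = df.withColumn("colname", F.col("colname").cast(ArrayType(StringType())))
    fixed.printSchema()  # colname: array<string>
    fixed.write.mode("overwrite").parquet("/tmp/out")

If the NullType column carries no useful data at all, dropping it with df.drop("colname") before the write is an even simpler option.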