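The issue with calling .display() on a pandas DataFrame in Databricks is that pandas DataFrames have no .display() method, so the call raises an AttributeError. In Databricks, display() is a notebook function, and .display() is a method that Databricks makes available on PySpark DataFrames, which are distributed across many nodes. It is not part of pandas, whose DataFrames are single-node objects held in the driver's memory. To render a pandas DataFrame, either pass it to the notebook function as display(pdf), or convert it to a PySpark DataFrame with spark.createDataFrame(pdf) and call .display() on the result. The conversion that may not be feasible for large datasets is the opposite one: calling .toPandas() on a PySpark DataFrame collects every row onto the driver.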
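As a rough illustration, the sketch below picks a rendering call based on what the object actually supports instead of assuming a .display() method exists. It assumes that a Databricks notebook defines a global display() function and that PySpark DataFrames there expose .display(). The name show_frame is a hypothetical helper, not part of pandas, PySpark, or Databricks.

```python
def show_frame(df, n=20):
    """Render df with the best mechanism available; return which one was used.

    Hypothetical helper for illustration only.
    """
    # 1. Assumption: on Databricks, PySpark DataFrames expose a .display() method.
    if callable(getattr(df, "display", None)):
        df.display()
        return "df.display()"

    # 2. Assumption: a Databricks notebook defines a global display() function
    #    that also accepts pandas DataFrames.
    notebook_display = globals().get("display")
    if callable(notebook_display):
        notebook_display(df)
        return "display(df)"

    # 3. Plain PySpark outside Databricks: .show(n) prints the first n rows.
    if callable(getattr(df, "show", None)):
        df.show(n)
        return "df.show()"

    # 4. Plain pandas outside a notebook: print the first n rows.
    print(df.head(n) if hasattr(df, "head") else df)
    return "print"
```

In a Databricks notebook, show_frame(pdf) on a pandas DataFrame would take the second branch, while a PySpark DataFrame would take the first. Neither path calls .toPandas(), so nothing is collected onto the driver beyond what the renderer itself fetches.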