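The issue is that pandas DataFrames do not have a .display() method, so calling it on a pandas DataFrame raises an AttributeError. In a Databricks notebook, display() is a function provided by the notebook environment, and .display() is a convenience method that the environment adds to PySpark DataFrames; neither is part of pandas. To render a pandas DataFrame, pass it to the function form, display(pdf), or convert it to a PySpark DataFrame with spark.createDataFrame(pdf) and call .display() on the result. Going the other way, converting a PySpark DataFrame to pandas with toPandas() collects all of the distributed data onto the driver node, which may not be feasible for large datasets.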
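As a rough illustration, the sketch below checks that a pandas DataFrame has no .display() method and wraps the two usual workarounds in a small helper. The helper name render and its parameters display_fn and spark are hypothetical, introduced only for this example; it assumes you pass in the notebook's own display function and SparkSession when running on Databricks.

```python
import pandas as pd

# Small sample frame, made up for the illustration.
pdf = pd.DataFrame({"id": [1, 2, 3], "label": ["a", "b", "c"]})

# pandas DataFrames have no .display() method, so pdf.display() would
# raise AttributeError.
has_method = hasattr(pdf, "display")


def render(frame, display_fn=None, spark=None):
    """Show a pandas DataFrame (hypothetical helper, not a Databricks API).

    display_fn: the notebook's display() function, if one is available.
    spark:      an optional SparkSession; when given, the frame is first
                converted to a PySpark DataFrame with createDataFrame().
    """
    if display_fn is None:
        # No notebook display function: fall back to a plain-text preview.
        text = frame.head(20).to_string()
        print(text)
        return text
    target = spark.createDataFrame(frame) if spark is not None else frame
    display_fn(target)
    return None


# Outside a notebook this prints the first rows as text.
render(pdf)
```

In a Databricks notebook, where display and spark are predefined, the call would presumably be render(pdf, display_fn=display) to show the pandas DataFrame directly, or render(pdf, display_fn=display, spark=spark) to convert it to a PySpark DataFrame first.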
Asked: 2023-07-02 03:00:27 +0000
Last updated: Jul 02 '23