What is the issue with the .display() function on a Pandas DataFrame in PySpark on Databricks?

asked 2023-07-02 03:00:27 +0000

djk

1 Answer

answered 2023-07-02 03:09:02 +0000

plato

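The issue is that pandas DataFrames have no .display() method, so calling pdf.display() raises AttributeError: 'DataFrame' object has no attribute 'display'. In Databricks notebooks, display() is a built-in notebook function rather than a pandas method, and the .display() method form is something Databricks provides for PySpark DataFrames, which are distributed across many nodes. To render a pandas DataFrame, which is a single-node object, pass it to the function as display(pdf), or convert it to a PySpark DataFrame first with spark.createDataFrame(pdf) and call .display() on the result. The reverse conversion, toPandas(), collects every row onto the driver and may not be feasible for large datasets.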
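A minimal sketch of the distinction, assuming pandas is installed. The names pdf and show are hypothetical, chosen only for illustration. The helper relies on the assumption that Databricks (like IPython) makes a display() function available in the notebook namespace, and falls back to plain printing anywhere else.

```python
import pandas as pd

# Small illustrative frame; the column names and values are made up.
pdf = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})

# pandas does not define DataFrame.display, so pdf.display() would raise
# AttributeError no matter where the code runs.
has_display_method = hasattr(pdf, "display")


def show(frame):
    """Render a pandas DataFrame with the notebook's display() when it exists."""
    try:
        # Assumption: a notebook environment has injected display() as a
        # function. In plain Python the name is undefined.
        display(frame)  # noqa: F821
    except NameError:
        # Fallback outside a notebook: print the frame as text.
        print(frame.to_string())


show(pdf)

# In a Databricks notebook, where `spark` is predefined, the other option
# would be to convert first (not run here because `spark` is not defined):
#   spark.createDataFrame(pdf).display()
```

Calling the function form works for both pandas and PySpark DataFrames in a notebook, so it is usually the simpler choice when the data already fits in a pandas DataFrame.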

Last updated: Jul 02 '23