The number of columns in a PySpark DataFrame can be obtained with len(df.columns), since df.columns returns a plain Python list of column names. (Note that df.count() returns the number of rows, not columns.) Here's an example:
from pyspark.sql import SparkSession
# initialize Spark
spark = SparkSession.builder.appName("column_count").getOrCreate()
# create a PySpark dataframe
data = [("John", 25, "Male"), ("Jane", 30, "Female"), ("Bob", 20, "Male")]
df = spark.createDataFrame(data, ["Name", "Age", "Gender"])
# count the number of columns
num_columns = len(df.columns)
print("Number of columns in dataframe:", num_columns)
Output:
Number of columns in dataframe: 3
Asked: 2023-03-03 11:00:00 +0000