How can StructType be dynamically created in PySpark?

asked 2023-06-03 10:13:46 +0000

scrum

1 Answer

answered 2023-06-03 10:14:01 +0000

devzero

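In PySpark, a StructType can be created dynamically by passing a list of StructField objects to the StructType() constructor, or by calling add() on an existing StructType to append one field at a time. Because the fields are ordinary Python objects, the list can be built at run time.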

Here's an example:

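from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

spark = SparkSession.builder.getOrCreate()

# Define the schema as a list of StructField objects
schema = StructType([
    StructField("id", IntegerType()),
    StructField("name", StringType()),
    StructField("age", IntegerType()),
])

# Sample rows for illustration: one tuple per row, values in schema order
data = [(1, "Alice", 30), (2, "Bob", 25)]

# Create the DataFrame with the schema
df = spark.createDataFrame(data, schema)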

In this example, we define the schema by creating a StructType from a list of StructField objects, each of which specifies a field name and a data type. The fields are written out literally here, but the constructor takes an ordinary Python list, so that list can just as well be generated at run time from whatever describes your columns. Finally, we create a DataFrame by calling spark.createDataFrame() with the schema as the second argument.

Note: data should be a list of tuples, where each tuple holds the values for one row, in the same order as the fields in the schema.
