Here's an example of how to iterate through a Spark DataFrame in Scala and save each column's name and value into variables inside a for loop:

import org.apache.spark.sql.Row

val df = ... // your DataFrame here

// Iterate through the rows in the DataFrame.
// Note: this for loop is shorthand for df.foreach(...), so Spark runs the
// body on the executors rather than on the driver.
for (row: Row <- df) {

  // Iterate through the columns in the row.
  for (i <- 0 until row.length) {

    // Save the column name into a variable.
    val colName = row.schema.fieldNames(i)

    // Get the value in the column at this row (returned as Any).
    val colValue = row(i)

    // Do some operation here with the column name and value...

  }
}

In this example, we first import the Row class from the org.apache.spark.sql package. We then define our DataFrame as df.

Next, we use a for loop to iterate through each row in the DataFrame. For each row, we use another for loop to iterate through each column in the row.

Inside the inner loop, we save the column name into a variable called colName. The row's schema property returns its StructType, and the fieldNames method on that returns an array of column names. Indexing that array with i gives the name of the current column.
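If only the column names are needed, they can also be read from the DataFrame itself without touching any rows. Here is a minimal sketch, assuming a local SparkSession. The sample data and the names colNames, "id" and "name" are made up for illustration:

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for experimenting; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("column-names-sketch").getOrCreate()
import spark.implicits._

// Hypothetical two-column DataFrame standing in for your own.
val df = Seq((1, "alice"), (2, "bob")).toDF("id", "name")

// df.columns returns the column names as an Array[String] on the driver.
val colNames: Array[String] = df.columns

for (colName <- colNames) {
  println(colName) // prints "id", then "name"
}

spark.stop()
```

Since the names are the same for every row, reading them once up front avoids repeating the schema lookup per row.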

We also get the value in that column for the current row by calling row(i), which returns it as Any, and save it into a variable called colValue.
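Because row(i) returns Any, the value usually has to be cast or pattern matched before it is useful. When the column type is known, Row also offers typed access. Here is a rough sketch, assuming a column "id" of type Int and a column "name" of type String (both hypothetical) in a single-row sample DataFrame:

```scala
import org.apache.spark.sql.{Row, SparkSession}

// Assumption: a local session for experimenting; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("typed-access-sketch").getOrCreate()
import spark.implicits._

// Hypothetical single-row DataFrame standing in for your own.
val df = Seq((1, "alice")).toDF("id", "name")
val row: Row = df.head()

// Untyped access by position, as in the loop above.
val untyped: Any = row(0)

// Typed access by column name; this throws if the stored value is not of the requested type.
val id: Int = row.getAs[Int]("id")
val name: String = row.getAs[String]("name")

// Check for nulls by position before using a value.
val nameIsNull: Boolean = row.isNullAt(row.fieldIndex("name"))

spark.stop()
```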

Finally, we can do some operation with colName and colValue inside the loop. The specific operation will depend on your use case.
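One caveat: Spark runs the body of a for loop over a DataFrame as a foreach on the executors, so on a cluster, values assigned to driver-side variables inside the loop are not visible afterwards. If the operation needs the names and values back on the driver, one option is to collect the rows first. Here is a minimal sketch, assuming the DataFrame is small enough to fit in driver memory. The sample data and the name pairs are made up for illustration:

```scala
import org.apache.spark.sql.{Row, SparkSession}

// Assumption: a local session for experimenting; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("row-iteration-sketch").getOrCreate()
import spark.implicits._

// Hypothetical two-column DataFrame standing in for your own.
val df = Seq((1, "alice"), (2, "bob")).toDF("id", "name")

// Column names are the same for every row, so read them once.
val colNames: Array[String] = df.columns

// collect() brings every row to the driver; only do this for small results.
val rows: Array[Row] = df.collect()

// Pair each column name with its value, row by row.
val pairs: Array[Seq[(String, Any)]] =
  rows.map(row => colNames.zip(row.toSeq).toSeq)

for (rowPairs <- pairs; (colName, colValue) <- rowPairs) {
  println(s"$colName = $colValue")
}

spark.stop()
```

For larger DataFrames, keeping the work inside Spark (for example with column expressions) is usually a better fit than collecting.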