Here's an example of how to iterate through a Spark DataFrame in Scala and store each column name in a variable inside a for loop:
import org.apache.spark.sql.Row

val df = ... // your DataFrame here

// iterate through the rows in the DataFrame
for (row: Row <- df) {
  // iterate through the columns in the row
  for (i <- 0 until row.length) {
    // save the column name into a variable
    val colName = row.schema.fieldNames(i)
    // get the value in the column at this row
    val colValue = row(i)
    // do some operation here with the column name and value...
  }
}
In this example, we first import the Row class from the org.apache.spark.sql package. We then define our DataFrame as df.
Next, we use a for loop to iterate through each row in the DataFrame. For each row, we use another for loop to iterate through each column in the row.
Inside this loop, we save the column name into a variable called colName by accessing the schema property of the row and using the fieldNames method to get an array of column names. We use the index i to access the correct column name from the array.
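Because every row in a DataFrame shares the same schema, the column names can also be read once from the DataFrame itself rather than from each row. The following is a minimal sketch of that idea, not part of the original answer: the object name ColumnNamesSketch, the helper columnNamesOf, the local SparkSession, and the sample data are all assumptions made for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ColumnNamesSketch {
  // Hypothetical helper: reads the column names from the DataFrame's schema.
  // df.columns returns the same names as an Array[String].
  def columnNamesOf(df: DataFrame): Seq[String] =
    df.schema.fieldNames.toSeq

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("column-names-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for your DataFrame.
    val df = Seq((1, "alice"), (2, "bob")).toDF("id", "name")

    // No row iteration is needed to obtain the names.
    for (colName <- columnNamesOf(df)) {
      println(colName)
    }

    spark.stop()
  }
}
```

This avoids looking up the schema again for every row when only the names are needed.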
We also get the value in the column at this row by accessing the i-th index of the row object. We save this value into a variable called colValue.
Finally, we can do some operation with colName and colValue inside the loop. The specific operation will depend on your use case.
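One caveat worth keeping in mind: a for loop over a DataFrame without yield is translated by the Scala compiler into a call to foreach, and Spark runs that body on the executors, so values assigned inside the loop are not visible on the driver afterwards. If the DataFrame is small enough to fit in driver memory, one possible alternative is to collect the rows first and pair each column name with its value there. The sketch below assumes exactly that; the object name RowPairsSketch, the helper nameValuePairs, the local SparkSession, and the sample data are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object RowPairsSketch {
  // Hypothetical helper: pairs each column name with its value, one Seq per row.
  // collect() copies every row to the driver, so this assumes a small DataFrame.
  def nameValuePairs(df: DataFrame): Seq[Seq[(String, Any)]] = {
    val colNames = df.columns
    df.collect().toSeq.map(row => colNames.zip(row.toSeq).toSeq)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("row-pairs-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for your DataFrame.
    val df = Seq((1, "alice"), (2, "bob")).toDF("id", "name")

    // This loop runs on the driver, over plain Scala collections.
    for (pairs <- nameValuePairs(df); (colName, colValue) <- pairs) {
      println(s"$colName = $colValue")
    }

    spark.stop()
  }
}
```

For large DataFrames, collecting is usually not an option, and the per-row work is better expressed with DataFrame operations or kept inside foreach.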