There are several ways to use a Python variable in a SQL query in Databricks:

  1. Using string formatting (only for trusted values, since the value is pasted directly into the SQL text):
variable = "some_value"
query = f"SELECT * FROM table WHERE column = '{variable}'"
spark.sql(query)
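  2. Using parameterization (needs a recent runtime; in Spark 3.5 and later, spark.sql accepts an args dictionary of Python values):
variable = "some_value"
# :value is a named parameter marker; Spark binds it from args rather than splicing it into the SQL text
query = "SELECT * FROM table WHERE column = :value"
spark.sql(query, args={"value": variable}).show()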
  3. Using a temporary view:
from pyspark.sql.functions import lit
variable = "some_value"
# Single-row view whose "value" column holds the Python variable
spark.range(1).withColumn("value", lit(variable)).createOrReplaceTempView("temp_table")
query = "SELECT * FROM table WHERE column = (SELECT value FROM temp_table)"
spark.sql(query)
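
To see why the string-formatting approach should be limited to trusted values, here is a minimal sketch in plain Python that only builds the query text and never touches Spark. The helper name build_query and the sample input are hypothetical, chosen for illustration:

```python
# Hypothetical helper that mirrors the f-string approach from method 1.
def build_query(variable: str) -> str:
    return f"SELECT * FROM table WHERE column = '{variable}'"

# An ordinary value produces the query you expect.
safe = build_query("some_value")

# A value containing a single quote changes the structure of the statement:
# the WHERE clause gains an extra OR condition.
unsafe = build_query("x' OR '1'='1")

print(safe)
print(unsafe)
```

The second query's filter is no longer a single equality test. A bound parameter is passed to Spark as a value rather than as SQL text, so the parameterized form should be preferred when the variable comes from user input.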