No, it is not impossible: you can retrieve secrets stored in Secret Manager from Cloud Dataproc (PySpark). Use the Secret Manager client library to read the secret, then pass its value to your PySpark code. Here's an example:

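First, create a Secret Manager client. It authenticates with Application Default Credentials; on a Dataproc cluster that is normally the cluster's VM service account, which needs the Secret Manager Secret Accessor role (roles/secretmanager.secretAccessor) on the secret. The google-cloud-secret-manager package must also be installed on the cluster: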

from google.cloud import secretmanager

# Authenticate with the Secret Manager API
client = secretmanager.SecretManagerServiceClient()

Next, retrieve the secret:

# Retrieve the secret
project_id = '<your-project-id>'
secret_id = '<your-secret-id>'
version_id = '<your-version-id>'  # a version number, or 'latest' for the most recent version

name = f"projects/{project_id}/secrets/{secret_id}/versions/{version_id}"
response = client.access_secret_version(request={"name": name})

# Extract the secret value
secret_string = response.payload.data.decode('UTF-8')
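
The retrieval step above can be wrapped in a small reusable helper. This is only a sketch: the function names secret_version_name and fetch_secret are hypothetical (not part of the client library), and defaulting version_id to "latest" is an assumption about what you want. The client is passed in as a parameter, so the helper itself makes no assumption about how credentials are set up.

```python
def secret_version_name(project_id, secret_id, version_id="latest"):
    """Build the resource name of a secret version (hypothetical helper)."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version_id}"


def fetch_secret(client, project_id, secret_id, version_id="latest"):
    """Return the secret payload as a UTF-8 string (hypothetical helper).

    `client` is expected to behave like
    secretmanager.SecretManagerServiceClient().
    """
    name = secret_version_name(project_id, secret_id, version_id)
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("UTF-8")


# Example usage on the cluster (placeholders, not real values):
#   from google.cloud import secretmanager
#   client = secretmanager.SecretManagerServiceClient()
#   secret_string = fetch_secret(client, "<your-project-id>", "<your-secret-id>")
```

Fetch the secret once on the driver and pass the resulting string on, rather than creating a client inside code that runs on every executor.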

Finally, pass the secret to your Pyspark code:

from pyspark.sql import SparkSession

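# Parentheses are used instead of backslash continuations so that
# comments can sit inside the chain without causing a syntax error.
spark = (
    SparkSession.builder
    .appName("MyApp")
    .config("spark.some.config.option", "some-value")
    # Pass the secret as a Spark configuration option
    .config("mysecret", secret_string)
    .getOrCreate()
)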