To connect to and authenticate with a Delta Lake table on Azure Data Lake Storage Gen2 (ADLS Gen2) using the delta-rs Python API, follow these steps:
Install the delta-rs Python package, which is published on PyPI as deltalake:
pip install deltalake
Import the DeltaTable class from the deltalake module:
from deltalake import DeltaTable
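Create a DeltaTable instance, specifying the URI of the Delta Lake table on ADLS Gen2. Gen2 uses the abfss:// scheme with the container name in front of the account host (adl:// is the Gen1 scheme). The credentials described in the next step must already be set when this line runs:
table = DeltaTable("abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<path-to-delta-lake-table>")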
Set the ADLS Gen2 credentials in the following environment variables, which delta-rs reads when it opens the table:
export AZURE_STORAGE_ACCOUNT_NAME=<storage-account-name>
export AZURE_STORAGE_ACCOUNT_KEY=<storage-account-key>
You can also set these variables programmatically using the os module, as long as you do so before creating the DeltaTable instance:
import os
os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "<storage-account-name>"
os.environ["AZURE_STORAGE_ACCOUNT_KEY"] = "<storage-account-key>"
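As an alternative to process-wide environment variables, DeltaTable also accepts a storage_options dictionary. The following is a minimal sketch, not a definitive setup: the helper names build_adls_config and open_table are hypothetical, the URI uses the abfss:// form for ADLS Gen2, and the account key is assumed to be stored in an environment variable named AZURE_STORAGE_ACCOUNT_KEY.

```python
import os
from typing import Dict, Mapping, Tuple


def build_adls_config(
    account: str,
    container: str,
    table_path: str,
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, Dict[str, str]]:
    """Return (table_uri, storage_options) for a Delta table on ADLS Gen2.

    Hypothetical helper for illustration; not part of delta-rs.
    """
    # ADLS Gen2 URI form: abfss://<container>@<account>.dfs.core.windows.net/<path>
    uri = f"abfss://{container}@{account}.dfs.core.windows.net/{table_path.strip('/')}"
    # Assumption: the key lives in AZURE_STORAGE_ACCOUNT_KEY, so it stays out of source code.
    options = {
        "account_name": account,
        "account_key": env["AZURE_STORAGE_ACCOUNT_KEY"],
    }
    return uri, options


def open_table(account: str, container: str, table_path: str):
    """Open the table, passing credentials explicitly instead of via os.environ."""
    # Imported lazily so build_adls_config can be used without deltalake installed.
    from deltalake import DeltaTable

    uri, options = build_adls_config(account, container, table_path)
    return DeltaTable(uri, storage_options=options)
```

Passing the options explicitly keeps credentials scoped to one table, which can help when a single process reads from more than one storage account.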
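Use the DeltaTable instance to query or modify the Delta Lake table as needed:
df = table.to_pandas() # loads the current table version into a pandas DataFrame (requires pandas)
table.delete() # deletes all rows (pass a predicate to delete selectively); the table itself remains
table.vacuum() # dry run by default: lists files no longer referenced by the table and older than the retention period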
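Because vacuum permanently removes data files once the dry run is turned off, it can be worth listing the candidates first. The sketch below shows one way to do that; preview_then_vacuum is a hypothetical helper, and the 168-hour retention value is an assumption (one week), not a recommendation taken from this answer.

```python
from typing import Any, List


def preview_then_vacuum(
    table: Any, retention_hours: int = 168, confirm: bool = False
) -> List[str]:
    """List the files vacuum would remove; delete them only when confirm is True.

    `table` is expected to be a deltalake.DeltaTable (hypothetical helper).
    """
    # dry_run=True only reports unreferenced files; nothing is deleted.
    candidates = table.vacuum(retention_hours=retention_hours, dry_run=True)
    if confirm and candidates:
        # dry_run=False performs the actual deletion.
        table.vacuum(retention_hours=retention_hours, dry_run=False)
    return list(candidates)
```

Calling preview_then_vacuum(table) returns the file list without touching storage; calling it again with confirm=True removes those files.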
Asked: 2022-03-05 11:00:00 +0000
Last updated: Sep 29 '21