How can the delta-rs Python API be used to connect to and authenticate with Delta Lake on Azure Data Lake Storage Gen 2?

asked 2022-03-05 11:00:00 +0000

qstack

1 Answer

answered 2021-09-29 15:00:00 +0000

scrum

To connect to and authenticate with Delta Lake on Azure Data Lake Storage Gen 2 using the delta-rs Python API, you can follow these steps:

  1. Install the delta-rs Python bindings, which are published on PyPI as deltalake:

    pip install deltalake
    
  2. Import the DeltaTable class from the deltalake module:

    from deltalake import DeltaTable
    
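  3. Create a DeltaTable instance, passing the URI of the Delta Lake table on ADLS Gen2. The URI uses the abfss:// scheme (adl:// refers to ADLS Gen1) and names both the container and the storage account:

    table = DeltaTable("abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<path-to-delta-lake-table>")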
    
  4. Set the ADLS Gen2 credentials as environment variables. Do this before the DeltaTable instance is created, since the credentials are read when the table is opened:

    export AZURE_STORAGE_ACCOUNT_NAME=<storage-account-name>
    export AZURE_STORAGE_ACCOUNT_KEY=<storage-account-key>
    

    You can also set these variables programmatically using the os module:

    import os
    os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "<storage-account-name>"
    os.environ["AZURE_STORAGE_ACCOUNT_KEY"] = "<storage-account-key>"
    
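  5. Use the DeltaTable instance to query or maintain the Delta Lake table as needed:

    df = table.to_pandas()  # reads the table into a pandas DataFrame (requires pandas)
    print(df.head())
    table.delete()  # removes all rows (pass a predicate to remove a subset); the table itself remains
    table.vacuum()  # lists files no longer referenced by the table; a dry run by default, pass dry_run=False to delete them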
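
As an alternative to environment variables, credentials can be passed directly through the storage_options argument of DeltaTable. The following is a minimal sketch, assuming shared-key authentication and a recent deltalake release. The helper names (abfss_uri, account_key_options, open_table) are hypothetical and exist only for illustration; they are not part of the library.

```python
def abfss_uri(container: str, account: str, table_path: str) -> str:
    """Build an abfss:// URI for a table stored in an ADLS Gen2 container."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{table_path.lstrip('/')}"


def account_key_options(account: str, key: str) -> dict:
    """Storage options for shared-key authentication (assumed auth method)."""
    return {"account_name": account, "account_key": key}


def open_table(container: str, account: str, key: str, table_path: str):
    """Open a Delta table on ADLS Gen2 with explicit credentials."""
    # Imported lazily so the helpers above work without deltalake installed.
    from deltalake import DeltaTable

    return DeltaTable(
        abfss_uri(container, account, table_path),
        storage_options=account_key_options(account, key),
    )
```

Calling open_table("<container-name>", "<storage-account-name>", "<storage-account-key>", "<path-to-delta-lake-table>") returns a DeltaTable that can be used as in step 5. Passing the options explicitly keeps the account key out of the process environment, which may be preferable when one program connects to more than one storage account.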
    
Stats

Asked: 2022-03-05 11:00:00 +0000

Seen: 8 times

Last updated: Sep 29 '21