Programmatically creating a data asset from a datastore URI in Azure ML Studio with Python (Azure ML SDK v1) involves the following steps:
Import the necessary packages:
from azureml.core import Workspace, Datastore, Dataset
import pandas as pd
Initialize a Workspace object using the Azure ML SDK:
ws = Workspace.from_config()
Get a reference to the Datastore by its registered name:
datastore = Datastore.get(ws, datastore_name='<datastore_name>')
Create a TabularDataset from the files in the Datastore and load it into a pandas DataFrame:
dataset = Dataset.Tabular.from_delimited_files(path=(datastore, '<dataset_file_path>'))
df = dataset.to_pandas_dataframe()
Manipulate the data as needed using pandas DataFrame operations:
df = df.drop(columns=['column_to_drop'])
df['new_column'] = df['column_a'] + df['column_b']
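As a concrete, self-contained illustration of the two operations above (the column names here are placeholders, not names from any real dataset):

```python
import pandas as pd

# Small sample frame standing in for the data loaded from the datastore
df = pd.DataFrame({
    "column_to_drop": [0, 0, 0],
    "column_a": [1, 2, 3],
    "column_b": [10, 20, 30],
})

# Drop an unwanted column, then derive a new one from two existing columns
df = df.drop(columns=["column_to_drop"])
df["new_column"] = df["column_a"] + df["column_b"]

print(list(df.columns))            # ['column_a', 'column_b', 'new_column']
print(df["new_column"].tolist())   # [11, 22, 33]
```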
Write the updated data back to the Datastore and register it as a dataset:
updated_dataset = Dataset.Tabular.register_pandas_dataframe(
    df,
    target=(datastore, '<updated_dataset_file_path>'),
    name='<dataset_name>',
    description='<description>'
)
The registered dataset can then be consumed downstream, for example as an input to an Azure ML training pipeline or other workflows.
Asked: 2023-01-25 11:00:00 +0000