
Azure Machine Learning (ML) is a cloud-based platform that allows you to train, deploy, and manage machine learning models. It's a powerful tool that can help you build and deploy models quickly and efficiently.
Azure ML provides a user-friendly interface that makes it easy to get started with machine learning, even if you have no prior experience. With Azure ML, you can create and manage your own machine learning models, or use pre-built models from the Azure ML model gallery.
The Azure ML platform supports a wide range of machine learning algorithms and frameworks, including scikit-learn and TensorFlow. This means you can use your favorite machine learning libraries and frameworks to build and deploy models on Azure ML.
With Azure ML, you can also automate much of the process of getting your models into production. Models can be published as web service endpoints for real-time or batch scoring and then consumed from web APIs, applications, and IoT solutions.
Getting Started
To get started with Azure ML, create a notebook in Azure Machine Learning studio and connect to your workspace. Import the Azure ML Python SDK (the azure-ai-ml package for SDK v2, or the older azureml-core package) so your code can connect to and use resources in the workspace.
First, create a handle to your workspace by creating an ml_client. This will give you a way to reference your workspace and manage resources and jobs. To do this, you need to have your subscription ID, resource group name, and workspace name handy. You can find these values in the Azure Machine Learning studio toolbar.
To train a machine learning model, you'll need to set up a Python environment to run the experiment. This involves creating a folder to hold your Python scripts, reading the dataset with the Pandas library, and then submitting a job to train the model. A command job is used to run a custom training script, and it needs four things: an environment, the data, the command itself, and the training script.
Connect to Workspace
To connect to your Azure workspace, you'll need to create a handle to it. This can be done by creating an MLClient, which is a way to reference your workspace and manage resources and jobs.
You'll need to enter your subscription ID, resource group name, and workspace name in the code. To find these values, select your workspace name in the Azure Machine Learning studio toolbar, and copy the value for workspace, resource group, and subscription ID into the code.
Creating an MLClient doesn't connect to the workspace immediately. The client initialization is lazy, waiting for the first time it needs to make a call.
To access resources in your Azure workspace, you'll need to authenticate with Azure Active Directory. This allows you to use the workspace and its resources.
Here are the specific resources you'll need to access in your Azure workspace:
- Storage account to store data for model training
- Application Insights to monitor predictive services
- Azure Key Vault to manage credentials
Once you have your MLClient, you can use it to connect to your workspace and start working with its resources.
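Here's a minimal sketch of what that looks like with the v2 Python SDK (the azure-ai-ml and azure-identity packages). The subscription ID, resource group, and workspace name below are placeholders you'd replace with the values copied from the studio toolbar.

```python
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient

# Authenticate against Azure Active Directory
credential = DefaultAzureCredential()

# Handle to the workspace; the three identifiers below are placeholders
ml_client = MLClient(
    credential=credential,
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<AML_WORKSPACE_NAME>",
)

# Initialization is lazy: the workspace isn't contacted until the first call,
# for example when listing the compute targets attached to it
for compute in ml_client.compute.list():
    print(compute.name)
```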
Using Python
To get started with machine learning, you'll need to train a model using Python. Here's how to do it with the Azure ML framework in a few simple steps.
First, create a folder to save all your Python scripts. This will keep your code organized and make it easier to find what you need.
Next, you'll need to read the dataset using the Pandas library. This library is a must-have for any data scientist.
You'll also need to create a Python environment to run the experiment. This will ensure that your code runs smoothly and efficiently.
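As a rough sketch of those first steps, assuming the ml_client handle from the previous section and placeholder folder, file, and environment names:

```python
import os
import pandas as pd
from azure.ai.ml.entities import Environment

# 1. Create a folder to keep the training scripts together (name is a placeholder)
os.makedirs("./src", exist_ok=True)

# 2. Read the dataset with pandas (path is a placeholder)
df = pd.read_csv("./data/dataset.csv")
print(df.head())

# 3. Register a Python environment for the experiment, built from a conda file
#    on top of a standard Azure ML base image (both are placeholders)
custom_env = Environment(
    name="custom-sklearn-env",
    conda_file="./dependencies/conda.yaml",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
    description="Environment for the training job",
)
ml_client.environments.create_or_update(custom_env)
```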
Here are the basic components you'll need to train a model using Python:
- Environment
- Data
- Command job
- Training script
Let's take a closer look at each of these components.
- Environment: This is the setting in which your code will run. You can think of it as the "stage" where your experiment will take place.
- Data: This is the information you'll be using to train your model. Make sure it's clean and well-organized.
- Command job: This is the script that will run your training code. It's like a recipe for your experiment.
- Training script: This is the code that will actually train your model. It's where the magic happens!
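Here's a hedged sketch of how these four pieces fit together in a command job with SDK v2. The script name, data path, curated environment, and compute cluster name are placeholders rather than values from this article, and ml_client is the workspace handle created earlier.

```python
from azure.ai.ml import command, Input

# Every name below is a placeholder for illustration
job = command(
    code="./src",  # folder containing the training script
    command="python train.py --data ${{inputs.training_data}}",
    inputs={
        "training_data": Input(type="uri_file", path="./data/dataset.csv"),
    },
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",  # a curated environment
    compute="cpu-cluster",
    display_name="train-tree-model",
    experiment_name="getting-started",
)

# Submit the command job to the workspace and open the returned studio URL to monitor it
returned_job = ml_client.create_or_update(job)
print(returned_job.studio_url)
```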
Git Integration
Getting started with Azure Machine Learning in VS Code involves connecting to a remote compute instance through the Azure Machine Learning VS Code extension.
Once connected, you can use VS Code's built-in Git support to manage your code and collaborate with others: track changes, make commits, and push updates to your repository without leaving the editor.
With this integration in place, you can streamline your workflow and focus on building and training models.
Key Capabilities
Azure Machine Learning has some fantastic features that make it a breeze to work with, especially if you're not familiar with setting up a machine learning workflow from scratch.
One of the key capabilities is on-demand compute that you can customize based on your workload. This means you can scale up or down as needed.
Azure Machine Learning also has a data ingestion engine that can accept a wide range of sources. It's extensive, to say the least.
With Azure Machine Learning, workflow orchestration is incredibly simple. You can easily manage the flow of your machine learning process.
If you like to evaluate multiple models before selecting the final one, Azure Machine Learning has dedicated capabilities to manage this. This is a huge time-saver.
Here are the key features of Azure Machine Learning at a glance:
- On-demand compute
- Data ingestion engine
- Workflow orchestration
- Machine Learning model management
- Metrics & logs of all model training activities
- Model deployment
To get started, simply select "New compute instance" from the left navigation menu.
Data Management
Data Management in Azure ML is a crucial step in building accurate predictive models. You can use Azure Open Datasets to get curated public datasets, like the MNIST dataset, which can add scenario-specific features to your machine learning solutions.
To manage big data, Azure ML offers multiple services such as Azure SQL Database, Azure Cosmos DB, and Azure Data Lake. You can also utilize services like Apache Spark engines in Azure HDInsight and Databricks to transfer and transform big data.
To prepare your data for modeling, you can use the automated mode in Azure Machine Learning Studio, which includes capabilities like imputing missing values, encoding categorical features, and balancing data.
Handle to Workspace
To get started with managing your data, you'll need a handle to your workspace. This is where all your resources, including data for model training, Notebooks, and Experiments, are stored.
First, you need to create an Azure Machine Learning client to manage resources and jobs. You can do this by creating an ml_client for a handle to the workspace.
To create an ml_client, you'll need to copy the values for your subscription ID, resource group name, and workspace name from the Azure Machine Learning studio toolbar.
Here's what you'll need to copy:
- Subscription ID
- Resource group name
- Workspace name
Once you have these values, you can create an ml_client by initializing it with the subscription ID, resource group name, and workspace name. This will give you a handle to your workspace, allowing you to manage resources and jobs.
Managing Big Data
Managing big data is a challenge many of us face, but there are tools that can help. Azure ML offers multiple services like Azure SQL Database and Azure Cosmos DB to ingest voluminous data for building predictive models.
Big data can be transferred and transformed using services like Apache Spark engines in Azure HDInsight and Databricks. This helps us prepare our data for machine learning tasks.
Azure ML's services like Azure Data Lake enable us to store and manage large amounts of data. This is especially useful when working with big data.
Import Data
Importing data is a crucial step in machine learning. You can use Azure Open Datasets to get the raw MNIST data files.
Azure Open Datasets are curated public datasets that you can use to add scenario-specific features to machine learning solutions. Each dataset has a corresponding class, such as MNIST, to retrieve the data in different ways.
To import data, you'll need to download the MNIST dataset. You can do this using the Azure Open Datasets service.
Displaying sample images is also an important step. This will help you understand the data you're using to train your model.
The load_data function, included in a utils.py file, is used to parse the compressed files into numpy arrays. This function is placed in the same folder as the notebook.
Here's a summary of the steps to import data:
- Download the MNIST dataset using Azure Open Datasets
- Display some sample images to understand the data
- Use the load_data function to parse the data into numpy arrays
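A hedged sketch of those steps, following the Microsoft notebook tutorial listed in the sources; it relies on the v1 azureml-opendatasets package and on the tutorial's utils.py helper sitting next to the notebook, and the data folder name is a placeholder.

```python
import glob
import os

from azureml.opendatasets import MNIST
from utils import load_data  # tutorial helper that unpacks the .gz files into numpy arrays

# Download the raw MNIST files into a local data folder
data_folder = os.path.join(os.getcwd(), "data")
os.makedirs(data_folder, exist_ok=True)

mnist_file_dataset = MNIST.get_file_dataset()
mnist_file_dataset.download(data_folder, overwrite=True)

# Parse the compressed image and label files into numpy arrays,
# scaling pixel values to the [0, 1] range
X_train = load_data(
    glob.glob(os.path.join(data_folder, "**/train-images-idx3-ubyte.gz"), recursive=True)[0],
    False,
) / 255.0
y_train = load_data(
    glob.glob(os.path.join(data_folder, "**/train-labels-idx1-ubyte.gz"), recursive=True)[0],
    True,
).reshape(-1)
print(X_train.shape, y_train.shape)
```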
Clean Missing Data
Cleaning missing data is a crucial step in data management, and Azure Machine Learning Studio provides several ways to do it. Most algorithms can't handle missing values and some treat them inconsistently, so it's essential to address this issue.
To identify missing values, you can use the "Summarize Data" module and connect it to your "Edit Metadata" module. This will give you a column summarizing the missing value count for each attribute.
In Azure ML, you can replace missing values with statistical values like the mean, median, or mode. The median is usually preferred for machine learning because it is less affected by outliers than the mean.
The "Clean Missing Data" module in Azure ML can apply a single blanket operation to the selected features. You can use it to replace all missing numeric instances with the median, as was done with the "Age" column.
In some cases, deletion might be the best option, especially if there are only a few missing values. This was the case with the "Embarked" column, where 2 missing values were dropped to avoid adding another categorical value.
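"Clean Missing Data" is a Studio module, but the same operations are easy to express in code if you prefer. A rough pandas equivalent, assuming a Titanic-style CSV with "Age" and "Embarked" columns (the file name is a placeholder):

```python
import pandas as pd

df = pd.read_csv("titanic.csv")

# Missing-value count per column, the same summary "Summarize Data" surfaces in the Studio UI
print(df.isna().sum())

# Replace missing numeric values in "Age" with the median,
# which is less sensitive to outliers than the mean
df["Age"] = df["Age"].fillna(df["Age"].median())

# Drop the few rows missing "Embarked" rather than introduce another categorical value
df = df.dropna(subset=["Embarked"])
```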
Data Partitioning
Data Partitioning is crucial for evaluating a model's performance. You want to know how well it can predict future or unknown values, not just the ones it's already seen.
Randomly partitioning your data before training an algorithm is essential. This step allows you to test the validity and performance of your model.
A 70/30 split is a common industry practice for data partitioning. You can achieve this by setting "fraction of rows in the first output dataset" to 0.7.
By splitting your data in this way, 70% of the data will be randomly shuffled into one output node, while the remaining 30% will be shuffled into another.
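In code, the same 70/30 shuffle-and-split can be done with scikit-learn's train_test_split; the toy arrays below stand in for a real feature matrix and label vector:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data standing in for real features and labels
X = np.random.rand(100, 5)
y = np.random.randint(0, 2, size=100)

# 70/30 random split, mirroring "fraction of rows in the first output dataset" = 0.7
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, shuffle=True, random_state=42
)
print(len(X_train), len(X_test))  # 70 30
```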
Building and Deploying
Building and deploying machine learning models in Azure Machine Learning is a straightforward process. You can use one of three approaches: Expert mode, Azure ML studio-based Automated Machine learning, or Designer mode.
To build a model in Expert mode, you decide on the model to use, computing power, and dependencies. This approach is ideal for data scientists who want to use their programming knowledge to train models.
In Azure ML studio-based Automated Machine Learning, you can evaluate multiple models and have the best performing one returned without getting into the nitty-gritty of building a model yourself. This approach is useful when you want a working model in place without writing training code.
You can deploy a model for real-time inference by deploying it to Azure Container Instance. To do this, you create a deployment configuration that specifies the dependencies required to host the model and the amount of compute required.
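Deployment to Azure Container Instances goes through the v1 SDK (azureml-core). Here's a hedged sketch; the model path, environment name, entry script, and service name are all placeholders, and score.py is a scoring script you'd write separately.

```python
from azureml.core import Workspace
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()

# Register the trained model file with the workspace (path and name are placeholders)
model = Model.register(workspace=ws, model_path="outputs/model.pkl", model_name="demo-model")

# Dependencies needed to host the model: an environment plus a scoring script
env = Environment.get(workspace=ws, name="<CURATED_OR_CUSTOM_ENV_NAME>")
inference_config = InferenceConfig(entry_script="score.py", environment=env)

# Amount of compute to give the container instance
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

service = Model.deploy(
    workspace=ws,
    name="demo-aci-service",
    models=[model],
    inference_config=inference_config,
    deployment_config=deployment_config,
)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)
```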
Here are the key steps to run an automated machine learning algorithm:
- Specify the dataset with labels to train the data.
- Configure the automated machine learning run – name, target label, and the compute target on which to run the experiment.
- Select the algorithm and settings to apply – classification, regression, or time-series, configuration settings, and feature settings.
- Review the best model generated.
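A hedged sketch of configuring such a run with the SDK v2 automl module; the compute name, MLTable folder, target column, and limits are placeholders, and ml_client is the workspace handle from earlier.

```python
from azure.ai.ml import automl, Input

# Configure the automated ML run; every name here is a placeholder
classification_job = automl.classification(
    compute="cpu-cluster",
    experiment_name="automl-demo",
    training_data=Input(type="mltable", path="./training-mltable-folder"),
    target_column_name="label",
    primary_metric="accuracy",
    n_cross_validations=5,
)

# Keep the sweep bounded so it doesn't run indefinitely
classification_job.set_limits(timeout_minutes=60, max_trials=20)

# Submit the run, then review the best model it produces in the studio UI
returned_job = ml_client.jobs.create_or_update(classification_job)
print(returned_job.studio_url)
```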
Building
Building a machine learning model in Azure Machine Learning can be done in one of three ways: Expert mode, Azure ML studio-based Automated Machine Learning, and Designer mode. In Expert mode, data scientists can use their knowledge of programming languages like Python and associated libraries like PyTorch and Scikit-Learn to train machine learning models.
To build a machine learning model, you need to create a Machine Learning Resource from the Azure Portal. This involves providing information like Workspace Name, Region, Storage account, and Application Insights to create the workspace. You also need to create a Compute Instance with a specified VM.
In Azure ML studio-based Automated Machine Learning, you can use the studio to evaluate multiple models and return the best performing one. This approach is useful for users who don't want to get into the nitty-gritty of building a model. The studio automation takes away the need for manual trial and error iterations that come with building a model.
Here are the key steps to run an automated machine learning algorithm:
- Specify the dataset with labels to train the data
- Configure the automated machine learning run with name, target label, and compute target
- Select the algorithm and settings to apply
- Review the best model generated
You can also create a training script to train the model and save it as a .py file in your folder. The script preprocesses the data, splits it into test and train sets, and uses the training set to fit a tree-based model. You can use MLflow to log the parameters and metrics during this job.
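A hedged sketch of what such a train.py might look like; the input argument, label column, and model choice are placeholders rather than details from the article.

```python
# train.py - hypothetical training script; paths and column names are placeholders
import argparse

import mlflow
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

parser = argparse.ArgumentParser()
parser.add_argument("--data", type=str, help="path to the input CSV")
parser.add_argument("--n_estimators", type=int, default=100)
args = parser.parse_args()

mlflow.start_run()
mlflow.log_param("n_estimators", args.n_estimators)

# Preprocess: read the data and split features from the label column
df = pd.read_csv(args.data)
X = df.drop(columns=["label"])
y = df["label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Train a tree-based model and log its test accuracy
model = GradientBoostingClassifier(n_estimators=args.n_estimators)
model.fit(X_train, y_train)
mlflow.log_metric("test_accuracy", accuracy_score(y_test, model.predict(X_test)))

# Save the trained model with MLflow so Azure ML can register or deploy it later
mlflow.sklearn.log_model(model, artifact_path="model")
mlflow.end_run()
```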
The Azure Machine Learning Designer mode is a graphical utility that works along the lines of the No-code paradigm. It includes a wide range of pre-defined modules for data ingestion, feature selection and engineering, model training, and validation. You can add custom scripts, if needed, and use the explanation generator to understand the context and better interpret the results.
Here's a comparison of the three approaches at a glance:
- Expert mode: code-first; you pick the model, compute, and dependencies yourself, which suits data scientists comfortable with Python and libraries like PyTorch and Scikit-Learn.
- Automated Machine Learning: the studio evaluates multiple models and returns the best performing one, removing the manual trial-and-error iterations.
- Designer mode: a graphical, no-code utility built from pre-defined modules for data ingestion, feature engineering, training, and validation, with room for custom scripts where needed.
Set Kernel in VS Code
Setting up your kernel in VS Code is a crucial step in building and deploying your project. This involves creating a compute instance and selecting the correct kernel.
First, create a compute instance if you don't already have one. It's essential to have a running compute instance for your kernel to work properly.
To check if your compute instance is running, look for the "Start compute" option on the top bar above your notebook. If it's stopped, click on it to start it.
Once your compute instance is running, ensure that the kernel on the top right is set to Python 3.10 - SDK v2. If not, use the dropdown list to select this kernel.
If you see a banner asking you to authenticate, click on "Authenticate" to proceed. This will allow you to access your compute instance and kernel.
Now that your kernel is set up, you can run your notebook or open it in VS Code for a full integrated development environment (IDE). To do this, select "Open in VS Code" and choose either the web or desktop option.
Specification File Authoring
Specification File Authoring is a crucial step in building and deploying projects. You can simplify the process by using the Azure ML command in the Command Palette, which can be accessed by pressing ⇧⌘P on macOS or Ctrl+Shift+P on Windows and Linux.
This command provides a streamlined way to create specification files. The Azure Machine Learning View in VS Code is also available for specification file authoring.
Using these tools can save you time and effort, making the process more efficient. You can now focus on other important aspects of your project.
Frequently Asked Questions
Is Azure ML easy to learn?
Azure ML Studio offers a user-friendly interface that simplifies machine learning tasks, making it accessible to users without extensive programming knowledge. Its drag-and-drop functionality streamlines the process, making it easier to learn and use.
What is Azure ML used for?
Azure ML is a comprehensive platform for building, deploying, and managing machine learning models, including fine-tuning and integrating language models into applications. It enables users to create scalable and secure AI solutions with ease.
How to run ML models in Azure?
To run ML models in Azure, register your model, create an endpoint and deployment, and then manually scale and manage traffic between them. Follow these steps to successfully deploy and manage your machine learning models on the Azure platform.
Sources
- https://www.analyticsvidhya.com/blog/2021/09/a-comprehensive-guide-on-using-azure-machine-learning/
- https://code.visualstudio.com/docs/datascience/azure-machine-learning
- https://learn.microsoft.com/en-us/AZURE/machine-learning/tutorial-train-deploy-notebook
- https://datasciencedojo.com/blog/azure-ml-tutorial/
- https://learn.microsoft.com/en-us/azure/machine-learning/tutorial-train-model