Getting Started with Azure MLflow and Azure Machine Learning


Azure MLflow is a powerful tool that helps you manage and track machine learning models, making it easier to collaborate and deploy them at scale. It integrates seamlessly with Azure Machine Learning, allowing you to create, manage, and monitor your ML models in a single platform.

To get started with Azure MLflow and Azure Machine Learning, you need to have an Azure subscription. You can sign up for a free account on the Azure website. This will give you access to all the features of Azure MLflow and Azure Machine Learning.

Azure MLflow provides a simple and intuitive interface for creating and managing ML models. You can create a new MLflow project in Azure Machine Learning by clicking on the "New Project" button in the Azure portal. This will guide you through the process of setting up your project and creating your first ML model.

Setting Up and Running MLflow Projects

To set up and run MLflow projects in Azure Machine Learning, you'll need to create a backend configuration object. This object specifies the remote compute cluster you want to use for running your project, which is referenced by the COMPUTE key.

The backend configuration object is created as a JSON file, such as backend_config.json, where you indicate the name of your remote compute cluster, like "cpu-cluster". For example: {"COMPUTE": "cpu-cluster"}.

You'll also need to add the azureml-mlflow package as a pip dependency in your environment configuration file, conda.yaml, so that metrics and key artifacts are tracked in your workspace. Do this by listing mlflow and azureml-mlflow under the pip section of conda.yaml, as shown in the full example below.

Set Up the Pipeline and Run the ML Deployment

To set up the pipeline and run the ML deployment, you'll need to create a backend configuration object. This object will reference the name of your remote compute cluster where your project will run. For example, you can create a backend configuration object using the following code: `backend_config = {"COMPUTE": "cpu-cluster"}`.

You'll also need to add the `azureml-mlflow` package as a pip dependency to your environment configuration file. This will allow you to track metrics and key artifacts in your workspace. Here's an example of how to do this in your `conda.yaml` file:

```yaml
name: mlflow-example
channels:
  - defaults
dependencies:
  - numpy>=1.14.3
  - pandas>=1.0.0
  - scikit-learn
  - pip:
    - mlflow
    - azureml-mlflow
```

Once your backend configuration object and environment configuration file are set up, you can submit the run, making sure to set the parameter `backend = "azureml"`. This adds support for automatic tracking, model capture, log files, snapshots, and printed errors in your workspace. For example, you can submit the run from the CLI with: `mlflow run . --backend azureml --backend-config backend_config.json -P alpha=0.3`.
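
If you prefer Python over the CLI, here's a minimal sketch of the same submission using `mlflow.projects.run()`; it assumes the project files, the conda.yaml above, and the "cpu-cluster" compute target already exist, and that the azureml-mlflow and azureml-core packages are installed:

```python
import mlflow

backend_config = {"COMPUTE": "cpu-cluster"}  # or the path to backend_config.json

submitted_run = mlflow.projects.run(
    uri=".",                        # MLflow project in the current directory
    parameters={"alpha": 0.3},
    backend="azureml",              # run on Azure Machine Learning compute
    backend_config=backend_config,
    synchronous=True,               # wait for the run to finish
)
```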

To view your runs and metrics in the Azure Machine Learning studio, simply navigate to the studio and click on the "Runs" tab. From there, you can view all of your runs and metrics in one place.

Track Runs Locally or Remotely

You can track runs from your local machine or from remote compute using MLflow with Azure Machine Learning. This lets you store logged metrics and artifacts from runs that were executed on your local machine in your Azure Machine Learning workspace.

To track runs executing on Azure Machine Learning, you can use remote runs. Remote runs let you train your models in a more robust and repeatable way, and can leverage more powerful compute targets, such as Machine Learning compute clusters.

Azure Machine Learning automatically configures MLflow to work with the workspace the run is running in, so you don't need to configure the MLflow tracking URI. Experiments are also automatically named based on the details of the experiment submission.

You can submit training jobs to Azure Machine Learning by using MLflow Projects. Note that this capability is in public preview and is scheduled to be retired in September 2026.

Here are the ways to submit training jobs to Azure Machine Learning:

  • Submit jobs locally with Azure Machine Learning tracking
  • Migrate your jobs to the cloud via Azure Machine Learning compute

Note that Azure Machine Learning jobs, submitted with either the Azure CLI or the Azure Machine Learning SDK for Python (v2), are the recommended way to track machine learning workloads in Azure Machine Learning.

Model Governance and Management

Model governance and management are crucial aspects of any machine learning workflow. Azure Machine Learning supports MLflow for model management, making it a convenient option for users familiar with the MLflow client.

With Azure Machine Learning, you can register and track your models with the model registry, which supports the MLflow model registry. This allows for easy export and import of models across different workflows.

To register a model, you can use the `register_model()` method, passing in the model URI and a name for the registered model. For example, `mlflow.register_model(model_uri, "registered_model_name")`.
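
Here's a minimal sketch of that flow, training a trivial scikit-learn model so the example is self-contained; the name "registered_model_name" is illustrative, and MLflow must already be pointed at your workspace tracking URI:

```python
import mlflow
import mlflow.sklearn
from sklearn.linear_model import LinearRegression

# Train a trivial model so the example runs end to end
model = LinearRegression().fit([[0.0], [1.0], [2.0]], [0.0, 1.0, 2.0])

# Log the model as an artifact of a run
with mlflow.start_run() as run:
    mlflow.sklearn.log_model(model, artifact_path="model")

# Register the logged model in the workspace model registry
model_uri = f"runs:/{run.info.run_id}/model"
registered = mlflow.register_model(model_uri, "registered_model_name")
print(registered.name, registered.version)
```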

Here are the steps to register and view a model from a run:

  1. Call the `register_model()` method after a run is complete, passing in the model URI and a name for the registered model.
  2. View the registered model in your workspace with Azure Machine Learning studio.
  3. Select the Artifacts tab to see all the model files that align with the MLflow model schema.
  4. Select MLmodel to see the MLmodel file generated by the run.

The Azure Machine Learning model registry aligns with the MLflow model schema, making it easy to manage models across different workflows.

Configuring and Troubleshooting

To connect MLflow to an Azure Machine Learning workspace, you need the tracking URI for the workspace, which can be obtained using the Azure Machine Learning SDK v2 for Python. You can also use the Azure Machine Learning portal to get the tracking URI.

The tracking URI is constructed using the subscription ID, region, resource group name, and workspace name. For private link-enabled workspaces, you need to get the tracking URI using the Azure ML SDK or CLI v2.
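
For example, here's a minimal sketch of fetching the tracking URI with the Azure Machine Learning SDK v2 and pointing MLflow at it; the subscription, resource group, and workspace names are placeholders:

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
import mlflow

# Connect to the workspace with the SDK v2
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace-name>",
)

# Read the workspace's MLflow tracking URI and hand it to MLflow
tracking_uri = ml_client.workspaces.get(ml_client.workspace_name).mlflow_tracking_uri
mlflow.set_tracking_uri(tracking_uri)
```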

MLflow tries to authenticate to Azure Machine Learning on the first operation that interacts with the service. If you run into issues or unexpected authentication prompts during the process, increase the logging level to get more details about the error.

To use MLflow tracking and the MLflow model registry in an Azure Machine Learning workspace, your identity needs the corresponding workspace permissions for each. Some default roles, like AzureML Data Scientist or Contributor, are already configured to perform MLflow operations in an Azure Machine Learning workspace.

Configure Authentication

Configuring authentication is a crucial step in setting up MLflow with Azure Machine Learning. The Azure Machine Learning plugin for MLflow supports several authentication mechanisms through the package azure-identity.

By default, the plugin performs interactive authentication by opening the default browser to prompt for credentials. However, this approach isn't suitable for unattended environments like training jobs. Instead, you can configure a service principal to communicate with Azure Machine Learning.

You can configure authentication using environment variables, such as AZURE_CLIENT_SECRET, or using a certificate. If you're working on shared environments, it's recommended to set these environment variables at the compute level and manage them as secrets in an instance of Azure Key Vault.
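
As a rough sketch, service principal authentication comes down to exporting the standard azure-identity environment variables before the first MLflow operation; the values below are placeholders:

```python
import os

# Environment variables read by azure-identity's EnvironmentCredential
os.environ["AZURE_TENANT_ID"] = "<tenant-id>"
os.environ["AZURE_CLIENT_ID"] = "<service-principal-client-id>"
os.environ["AZURE_CLIENT_SECRET"] = "<service-principal-secret>"

# Any MLflow operation from here on authenticates as the service principal
# instead of opening a browser prompt.
```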

Here are the authentication methods tried by the Azure Machine Learning plugin for MLflow:

  • Environment: Reads account information specified via environment variables and uses it to authenticate.
  • Managed Identity: If the application is deployed to an Azure host with Managed Identity enabled, it authenticates with it.
  • Azure CLI: If a user signs in via the Azure CLI az login command, it authenticates as that user.
  • Azure PowerShell: If a user signs in via Azure PowerShell's Connect-AzAccount command, it authenticates as that user.
  • Interactive browser: Interactively authenticates a user via the default browser.

To troubleshoot authentication issues, you can increase the logging level to get more details about the error.
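
For example, a quick way to surface those details is to turn up Python logging before running your MLflow code; the exact logger names can vary, so DEBUG on the root logger is the broadest option:

```python
import logging

# Verbose logging for everything, including the authentication chain
logging.basicConfig(level=logging.DEBUG)

# Narrow it to the azure-identity credential flow if the output is too noisy
logging.getLogger("azure.identity").setLevel(logging.DEBUG)
```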

Experiment and Run Management

Experiment and run management is a crucial aspect of Azure MLflow. You can organize all of your machine learning experiments in a single place, making it easier to search, filter, and compare experiments.

With Azure MLflow, you can easily compare experiments, analyze results, and debug model training. This is achieved through experiment tracking, which allows you to see details about previous experiments and reproduce or rerun experiments to validate results.

You can also use MLflow to manage your experiments and runs, including tracking metrics, parameters, and models. This is possible through the MLflow SDK, which provides a range of features for experiment and run management.

Here are some key features of experiment and run management in Azure MLflow:

  1. Track and log metrics, parameters, and models
  2. Retrieve metrics, parameters, and models
  3. Submit training jobs
  4. Manage experiments and runs
  5. Manage MLflow models
  6. Manage non-MLflow models
  7. Deploy MLflow models to Azure Machine Learning (online and batch)
  8. Deploy non-MLflow models to Azure Machine Learning

Benefits of Experiments

Experimenting with machine learning can be a complex and time-consuming process, but having the right tools can make all the difference. Experiment tracking helps you organize all of your machine learning experiments in a single place, where you can search and filter experiments and drill down to see details about previous experiments.

With experiment tracking, you can easily compare experiments, analyze results, and debug model training. This is especially useful when you're trying to figure out why a particular experiment didn't work out as planned.

Experiment tracking also allows you to reproduce or rerun experiments to validate results, which is crucial for ensuring that your models are working correctly. This can save you a lot of time and effort in the long run.

Here are some of the key benefits of experiment tracking:

  • Organize all of your machine learning experiments in a single place.
  • Easily compare experiments, analyze results, and debug model training.
  • Reproduce or rerun experiments to validate results.
  • Improve collaboration, because you can see what other teammates are doing, share experiment results, and access experiment data programmatically.

By using experiment tracking, you can streamline your machine learning workflow and make it easier to collaborate with others. This can help you to develop more accurate and reliable models, and to get your projects done faster.

Configuring the Experiment

To configure the experiment, you need to use Python to submit the experiment to Azure Machine Learning. In a notebook or Python file, configure your compute and training run environment with the Environment class.

You can set the experiment name using the property experiment_name in the YAML definition of the job when submitting jobs using Azure Machine Learning CLI v2. Alternatively, you can use the MLflow command mlflow.set_experiment() to configure the experiment.

To submit a run, use the Experiment.submit() method; submitting the run automatically sets the MLflow tracking URI and directs MLflow logging to your workspace. A minimal sketch follows the list below.

Here are the steps to configure the experiment:

  • Set the experiment name using the property experiment_name in the YAML definition of the job.
  • Use the MLflow command mlflow.set_experiment() to configure the experiment.
  • Configure your compute and training run environment with the Environment class.
  • Submit a run using the Experiment.submit() method.
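
Here's a minimal sketch of those steps with the Azure Machine Learning SDK v1, assuming a train.py script, a downloaded workspace config.json, and the "cpu-cluster" compute target already exist:

```python
from azureml.core import Environment, Experiment, ScriptRunConfig, Workspace

ws = Workspace.from_config()  # reads config.json downloaded from the portal

# Configure the training environment from the conda.yaml shown earlier
env = Environment.from_conda_specification(name="mlflow-example", file_path="conda.yaml")

# Configure the compute target and training script
src = ScriptRunConfig(
    source_directory=".",
    script="train.py",
    compute_target="cpu-cluster",
    environment=env,
)

# Submitting the run sets the MLflow tracking URI and directs MLflow logging
# from train.py to the workspace
run = Experiment(workspace=ws, name="mlflow-example").submit(src)
run.wait_for_completion(show_output=True)
```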

Track Runs

You can track runs in Azure Machine Learning by using MLflow with Azure Machine Learning workspaces. This allows you to store logged metrics and artifacts from runs executed on your local machine into your Azure Machine Learning workspace.

When you submit runs to Azure Machine Learning, the MLflow tracking URI is configured automatically, so you don't need to set it yourself, and experiments are named automatically based on the details of the submission.

To track runs, you can use the MLflow SDK in Python to log metrics, parameters, and artifacts. You can also use the Azure Machine Learning CLI/SDK v2 to track and log metrics, parameters, and models.

Here are some ways to track runs in Azure Machine Learning; a minimal sketch of the first approach follows the list:

  • Use the `start_run()` method to start a training run and log metrics with `log_metric()`.
  • Use the `mlflow.projects.run()` method to submit an MLflow project as a job running on Azure Machine Learning compute.
  • Use the `Experiment.submit()` method to submit a run and automatically set the MLflow tracking URI.
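
As a minimal sketch of the first bullet, logging a parameter, a metric, and an artifact from a local run; the tracking URI below is a placeholder for the one retrieved from your workspace (see the configuration section above):

```python
import mlflow

# Placeholder: use the tracking URI retrieved from your workspace
mlflow.set_tracking_uri("<azureml-mlflow-tracking-uri>")
mlflow.set_experiment("local-tracking-example")

# Start a training run and log values against it
with mlflow.start_run():
    mlflow.log_param("alpha", 0.3)
    mlflow.log_metric("rmse", 0.78)
    mlflow.log_artifact("conda.yaml")  # any local file can be logged as an artifact
```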

By tracking runs in Azure Machine Learning, you can easily compare experiments, analyze results, and debug model training. You can also reproduce or rerun experiments to validate results.

Machine Learning Capabilities

Azure MLflow offers a range of machine learning capabilities that make it an attractive choice for data scientists and engineers.

You can track and log metrics, parameters, and models with the MLflow SDK. With Azure Machine Learning CLI/SDK v2, you can only download artifacts and models. In the Azure Machine Learning studio, you can track and log metrics, parameters, and models.

MLflow SDK also allows you to retrieve metrics, parameters, and models, but Azure Machine Learning CLI/SDK v2 can only download artifacts and models. The Azure Machine Learning studio can also retrieve metrics, parameters, and models.

Azure Machine Learning CLI/SDK v2 can submit training jobs, while the MLflow SDK can submit training jobs using MLflow Projects (preview). The Azure Machine Learning studio can also submit training jobs.

Here's a summary of the machine learning capabilities of Azure MLflow:

| Capability | MLflow SDK | Azure Machine Learning CLI/SDK v2 | Azure Machine Learning studio |
| --- | --- | --- | --- |
| Track and log metrics, parameters, and models | Yes | Download artifacts and models only | Yes |
| Retrieve metrics, parameters, and models | Yes | Download artifacts and models only | Yes |
| Submit training jobs | Via MLflow Projects (preview) | Yes | Yes |

Getting Started

To get started with Azure MLflow, you'll need to install the MLflow SDK and the Azure Machine Learning plugin. This can be done by running the command `pip install mlflow azureml-mlflow`. You can also use the `mlflow-skinny` package, a lightweight version of MLflow that's perfect for users who only need tracking and logging capabilities.

To use Azure Machine Learning as a backend for your MLflow projects, you'll need to install the `azureml-core` package by running `pip install azureml-core`. This will give you the necessary tools to connect your MLflow environment to your Azure Machine Learning workspace.

Before you can start using Azure Machine Learning with MLflow, you'll need to create a workspace and review the access permissions required for your MLflow operations. You can do this by following the instructions in the Azure Machine Learning documentation to create resources and get started.

Prerequisites

To get started with MLflow and Azure Machine Learning, you'll need to meet some prerequisites. Install the MLflow SDK (`mlflow`) and the Azure Machine Learning plugin for MLflow (`azureml-mlflow`) with `pip install mlflow azureml-mlflow`. You can also consider the `mlflow-skinny` package for a more lightweight experience.

Create an Azure Machine Learning workspace by following the instructions in Create resources you need to get started. Review the access permissions you need to perform your MLflow operations in your workspace.

To enable remote tracking, configure MLflow to point to the tracking URI of your Azure Machine Learning workspace. For more information, see Configure MLflow for Azure Machine Learning.

You'll also need to install the `azureml-core` package if you plan to use Azure Machine Learning as the backend for your MLflow projects. This can be done with `pip install azureml-core`.

Next Steps

Now that you've got your environment set up, it's time to start working with it. You can track ML experiments and models with MLflow.

To make the most of MLflow, you can start by deploying models. This is a crucial step in getting your project up and running. Deploying models with MLflow will allow you to put your machine learning models into production and start getting real-world results.

Monitoring your production models for data drift is also an important step. This will help you ensure that your models are still performing well over time and make any necessary adjustments.

Here are some key next steps to consider:

  • Deploy models with MLflow.
  • Monitor your production models for data drift.
  • Track Azure Databricks runs with MLflow.
  • Manage your models.

By following these steps, you'll be well on your way to getting your machine learning project off the ground. Remember to stay organized and keep track of your experiments and models as you go.

Java

Getting started with Java can be a bit tricky, but don't worry, I've got you covered. MLflow support in Java has some limitations, so it's good to know what to expect.

MLflow tracking in Java is limited to tracking experiment metrics and parameters on Azure Machine Learning jobs. This means you can't track artifacts and models directly with MLflow, but there's a workaround.

To save models or artifacts that you want to capture, use the mlflow.save_model method with the outputs folder in jobs. This is a simple yet effective solution to get around the limitations of MLflow tracking in Java.

Frequently Asked Questions

Is MLflow owned by Databricks?

MLflow is an open source platform, so it isn't owned by Databricks, but Databricks created it and is the company behind the managed version of MLflow.

How do I deploy MLflow on Azure?

To deploy MLflow on Azure, use the AzureML deployment plugin and register your MLflow model with Azure Machine Learning. This will enable you to deploy it as a web service to Azure Container Instances or Azure Kubernetes Service.

What is the difference between MLflow and AutoML?

MLflow offers more flexibility for customized model tuning, while AutoML provides a user-friendly, high-level abstraction of the machine learning process for non-experts.
