Master MLflow: Track Experiments and Deploy Models
This guide provides a comprehensive introduction to MLflow, a powerful tool for managing the machine learning lifecycle. You’ll learn how to track experiments, version models, and integrate MLflow into your professional workflows, ultimately building reproducible and scalable ML systems. We’ll cover everything from local experiment tracking to deploying production-ready models.
Overview of What You’ll Learn
- The fundamental reasons why experiment tracking is crucial for ML systems.
- How MLflow addresses the limitations of traditional methods like Jupyter notebooks and Git for ML development.
- Setting up MLflow on your local system and understanding its core components: tracking server, backend store, and artifact store.
- Creating MLflow experiments and logging parameters, metrics, and artifacts.
- Exploring the MLflow UI to visualize experiment runs and their associated details.
Prerequisites
- Python installed on your system.
- Basic understanding of machine learning concepts.
- Familiarity with using the terminal or command prompt.
Step 1: Understanding the Need for ML Experiment Tracking
Before diving into MLflow, it’s essential to understand why traditional development methods fall short for machine learning. ML projects often start with a single Jupyter notebook, a dataset, and one model. While this works for individual research or very small teams, it quickly becomes unmanageable in larger organizations.
The Problem with Ad Hoc Experiments
- Lack of Reproducibility: Without proper tracking, it’s difficult to recall the exact parameters, code versions, and environment settings used to train a specific model.
- Confusion and Inconsistency: As more data scientists work on a project, individual naming conventions and manual tracking methods lead to chaos.
- Memory Fallibility: Humans tend to overestimate their ability to remember the details of past experiments. Relying on memory is unreliable.
- Probabilistic Nature of ML: Unlike traditional software, ML model outputs are probabilistic due to data and randomness. This means versioning goes beyond just code; it includes the entire decision history.
What Constitutes an ML Experiment?
An ML experiment is an encapsulation of several key components:
- Code: The training scripts.
- Data: The datasets used.
- Parameters: Hyperparameters and other configuration settings.
- Randomness: The inherent randomness in training processes.
- Environment: The software packages and their versions (e.g., Python libraries).
Git effectively tracks code changes, but it doesn’t capture the data, parameters, or environment, which are critical for ML reproducibility.
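To make this concrete, here is a minimal sketch of the record a single run would need to capture; the function name and fields are illustrative only (this is not an MLflow API — MLflow records this kind of information for you):

```python
import hashlib
import platform
import sys

# Illustrative only: the metadata one training "run" needs for
# reproducibility -- parameters, data identity, randomness, environment.
def snapshot_run(params: dict, data_bytes: bytes, seed: int) -> dict:
    return {
        "params": params,                                       # hyperparameters
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),  # dataset fingerprint
        "seed": seed,                                           # source of randomness
        "python": sys.version.split()[0],                       # interpreter version
        "platform": platform.platform(),                        # OS / architecture
    }

record = snapshot_run({"lr": 0.01}, b"fake-dataset-bytes", seed=42)
print(sorted(record))
```

Notice that Git would version only the code that calls this function; everything else in the record lives outside version control.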
Why Notebooks Don’t Scale
- No Structured Metadata: Notebooks lack a systematic way to track multiple runs, parameters, or metrics within a single notebook or across different notebooks.
- Difficult Run Comparison: Comparing different training runs is cumbersome and error-prone.
The Dangers in Production
Without proper tracking, ML systems lose their decision history. This is dangerous in production because:
- Retraining is Frequent: Data changes, team members change, and infrastructure evolves, necessitating frequent model retraining.
- Lack of Auditability: It becomes impossible to definitively state why a particular model was deployed, which is crucial for compliance and debugging.
- No Safe Rollbacks: Reverting to a previous stable model version becomes a manual and risky process.
Expert Note: Excuses like “we’ll clean up the code later” or “this is just research” often lead to technical debt. Tracking doesn’t slow you down; it enhances future productivity.
Step 2: Setting Up MLflow Locally
This section guides you through installing MLflow and setting up a local tracking server.
Installation Steps
- Create a Project Directory: Create a new folder for your MLflow project (e.g., `MLflow_YouTube`).
- Open Terminal: Navigate to your project directory using your terminal or command prompt.
- Create a Virtual Environment: It's best practice to use a virtual environment to manage dependencies. Use Python's built-in `venv` module: `python -m venv venv`
- Activate the Virtual Environment:
  - On macOS/Linux: `source venv/bin/activate`
  - On Windows: `venv\Scripts\activate`
- Install MLflow: With the virtual environment activated, install MLflow using pip: `pip install mlflow`
Running the MLflow Tracking Server
MLflow provides a command-line interface (CLI) to manage its features. To start the local tracking server:
- Run the MLflow Command: In your activated virtual environment and project directory, run: `mlflow server`
- Access the UI: By default, the server runs on `http://127.0.0.1:5000`. Open this URL in your web browser to access the MLflow UI.
Tip: When you run mlflow server, MLflow creates an mlruns directory in your project folder. This directory stores your experiment artifacts by default.
Step 3: Creating Your First Experiment and Run
Now that MLflow is set up, let’s create an experiment and log some data.
Creating an Experiment
You can set an experiment name using the MLflow Python API. If the experiment doesn’t exist, MLflow will create it.
- Create a Python Script: Create a new Python file (e.g.,
lecture_2.py) in your project directory. - Import MLflow: Start by importing the MLflow library.
import mlflow - Set the Experiment: Use
mlflow.set_experiment()to define your experiment.mlflow.set_experiment("Demo Experiment")
Logging Parameters and Artifacts within a Run
Within an experiment, you can create multiple runs, each representing a single execution of your training process. You can log parameters, metrics, and artifacts for each run.
- Start a Run: Use the `with mlflow.start_run():` context manager. Any logging commands within this block will be associated with the current run: `with mlflow.start_run(run_name="My First Run"):`
- Log Parameters:
  - Log individual parameters: `mlflow.log_param("learning_rate", 0.01)`
  - Log a dictionary of parameters: `params = {"epochs": 100, "batch_size": 32}; mlflow.log_params(params)`
- Log Artifacts: Artifacts are any files produced during a run (e.g., models, plots, data files). Use `mlflow.log_artifact()`, assuming the file already exists: `mlflow.log_artifact("my_model.pkl")`
Example Script (`lecture_2.py`):

```python
import mlflow

# Set the experiment name
mlflow.set_experiment("Demo Experiment")

# Start a new run
with mlflow.start_run(run_name="My First Run"):
    # Log individual parameters
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("epochs", 100)

    # Log a dictionary of parameters
    more_params = {"batch_size": 32, "optimizer": "adam"}
    mlflow.log_params(more_params)

    # Log an artifact (here, a dummy file created on the fly)
    with open("my_model.pkl", "w") as f:
        f.write("This is a dummy model file.")
    mlflow.log_artifact("my_model.pkl")

print("MLflow run completed.")
```
Viewing Runs in the MLflow UI
- Run the Script: Execute your Python script (e.g., `python lecture_2.py`).
- Refresh the UI: Go back to your MLflow UI (`http://127.0.0.1:5000`) and navigate to the "Experiments" tab. You should see "Demo Experiment" and, within it, "My First Run".
- Explore Run Details: Click on the run name to see the logged parameters, metrics, and artifacts.
Tip: The mlruns directory now contains subdirectories for experiment IDs and run IDs, storing the actual artifact files.
Step 4: Understanding MLflow’s Backend and Artifact Stores
MLflow separates metadata storage (backend store) from file storage (artifact store).
Key Concepts
- Backend Store: Stores metadata such as parameters, metrics, tags, and run history. When running locally, MLflow defaults to file-based storage inside the `mlruns` directory; to use a SQLite database instead, start the server with `mlflow server --backend-store-uri sqlite:///mlflow.db`.
- Artifact Store: Stores the actual files generated during a run (e.g., model weights, datasets, plots). By default, this is the `mlruns` directory itself.
- Tracking Server: The service that hosts the MLflow UI and serves the MLflow APIs.
Exploring the Default Stores
You can inspect the contents of a SQLite backend store using SQL queries. The snippet below assumes the server was started with `mlflow server --backend-store-uri sqlite:///mlflow.db`, so an `mlflow.db` file exists in the project directory (it is created once you first log data).

```python
import sqlite3

import pandas as pd

# Connect to the MLflow backend database
connection = sqlite3.connect("mlflow.db")

# Get all table names
tables_df = pd.read_sql_query(
    "SELECT name FROM sqlite_master WHERE type='table';", connection
)
print("Tables in the database:")
print(tables_df)

# Explore the 'runs' table
runs_df = pd.read_sql_query("SELECT * FROM runs;", connection)
print("\nFirst 5 rows of the runs table:")
print(runs_df.head())

connection.close()
```
Best Practices for Production
Storing artifacts and metadata locally is suitable for development but not for production environments.
- Remote Artifact Storage: Use cloud object storage like AWS S3, Google Cloud Storage, or Azure Blob Storage for your artifacts.
- Remote Backend Store: Use a robust database like PostgreSQL, MySQL, or a managed database service for your backend store.
Configuring MLflow to use remote stores involves passing CLI flags such as `--backend-store-uri` and `--default-artifact-root` to `mlflow server` (or setting the corresponding environment variables), which is a more advanced topic.
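For illustration only, a production-style server invocation might look like the following; the database host, credentials, and bucket name are placeholders, not values from this guide:

```shell
# Placeholder hosts/credentials -- adjust to your own infrastructure.
mlflow server \
  --backend-store-uri postgresql://mlflow_user:SECRET@db-host:5432/mlflow \
  --default-artifact-root s3://my-mlflow-artifacts/ \
  --host 0.0.0.0 \
  --port 5000
```

With this setup, run metadata lands in Postgres while artifact files go to S3, and the server itself stays stateless.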
Step 5: Comprehensive Logging with MLflow
MLflow allows you to log various types of information to track your experiments thoroughly.
Logging Parameters
As demonstrated earlier, you can log individual parameters or entire dictionaries.
- Individual Parameters: `mlflow.log_param("key", "value")`
- Dictionary of Parameters: `mlflow.log_params({"key1": "value1", "key2": "value2"})`
Logging Metrics
Metrics are typically numerical values that change over time or represent performance indicators (e.g., accuracy, loss). Metrics can be logged at different steps during training.
- Log a Single Metric: `mlflow.log_metric("accuracy", 0.95)`
- Log Metrics with Steps: Useful for plotting training progress; `mlflow.log_metric("loss", 0.1, step=10)` logs the metric at step 10.
- Log a Dictionary of Metrics: Similar to parameters: `mlflow.log_metrics({"precision": 0.92, "recall": 0.97})`
Logging Artifacts
Artifacts are files generated by your run. This includes:
- Trained model files (e.g., `.pkl`, `.h5`, `.pt`)
- Data files (e.g., processed datasets)
- Images and plots (e.g., confusion matrices, training curves)
- Configuration files
- `requirements.txt` or environment files
Use `mlflow.log_artifact("path/to/your/file")` for a single file or `mlflow.log_artifacts("path/to/directory")` for an entire directory.
Example: Comprehensive Logging Script
Here’s an example script demonstrating logging parameters, metrics, and artifacts.
```python
import mlflow
import os
import random

# Set the experiment name
mlflow.set_experiment("YouTube Tutorial")

# Start a run
with mlflow.start_run(run_name="Comprehensive Logging Demo") as run:
    run_id = run.info.run_id
    print(f"Started run: {run_id}")

    # Log parameters
    params = {
        "learning_rate": 0.001,
        "epochs": 50,
        "batch_size": 64,
        "optimizer": "adam",
    }
    mlflow.log_params(params)

    # Log metrics (simulating training)
    for step in range(10):
        accuracy = 0.7 + (step * 0.02) + random.uniform(-0.01, 0.01)
        loss = 0.5 - (step * 0.03) + random.uniform(-0.01, 0.01)
        mlflow.log_metric("accuracy", accuracy, step=step)
        mlflow.log_metric("loss", loss, step=step)
        print(f"Step {step}: Accuracy={accuracy:.4f}, Loss={loss:.4f}")

    # Log an artifact (a dummy model file)
    model_filename = "model.pkl"
    with open(model_filename, "w") as f:
        f.write("Dummy model content")
    mlflow.log_artifact(model_filename)
    os.remove(model_filename)  # Clean up dummy file

    # Log another artifact (a dummy plot image)
    plot_filename = "accuracy_plot.png"
    with open(plot_filename, "w") as f:
        f.write("Dummy plot content")
    mlflow.log_artifact(plot_filename)
    os.remove(plot_filename)  # Clean up dummy file

print(f"Finished run: {run_id}")
print("Comprehensive logging demo complete.")
```
Viewing Comprehensive Logs
After running the script, check the MLflow UI. You will see:
- The “YouTube Tutorial” experiment.
- The “Comprehensive Logging Demo” run.
- Tabs for "Parameters", "Metrics" (showing plots of accuracy and loss over steps), and "Artifacts" (listing `model.pkl` and `accuracy_plot.png`).
This structured approach ensures that all critical information from your ML experiments is captured, making them reproducible, auditable, and manageable.
Source: Learn MLOps with MLflow and Databricks – Full Course for Machine Learning Engineers (YouTube)