MLflow

What is MLflow?

MLflow is an open-source platform for managing the machine learning lifecycle, including tracking experiments, managing model versions, and deploying models to production. Ilum ships with an MLflow integration aimed at data scientists and engineers. This documentation covers the features, integration steps, and usage of MLflow within Ilum.

Features of MLflow

Experiment Tracking

Log parameters, metrics, and artifacts for every machine learning experiment.
Visualize and compare experiment results in a centralized UI.
Automatically store experiment metadata in Ilum's integrated storage solutions.

Model Registry

Manage models with version control.
Transition models through lifecycle stages: development, staging, and production.
Integrate directly with Ilum deployment pipelines.

Setting Up MLflow in Ilum

Helm Deployment

To enable MLflow in Ilum, use the following Helm command:

helm upgrade \
  --set mlflow.enabled=true \
  --reuse-values ilum ilum/ilum

This command will:

Deploy MLflow and the MLflow tracking server.
Integrate MLflow with Ilum's storage and tracking services.
Enable MLflow Tracking UI and Model Registry.

Using MLflow in Ilum

Configuration Setup

Global Configuration

To enable MLflow across the current Ilum instance:

Update the Spark image in the cluster settings to ilum/spark:3.5.3-mlflow or a newer version.

Python/Scala Jobs

Add the following parameter in the service configuration:

spark.kubernetes.container.image: ilum/spark:3.5.3-mlflow

Jupyter Notebook

Add the following parameter in the properties JSON under the conf array:

"spark.kubernetes.container.image": "ilum/spark:3.5.3-mlflow"

Regardless of the environment, set the Tracking URL to: http://ilum-mlflow-tracking:5000.
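One way to apply the Tracking URL from code is sketched below. It relies on the `MLFLOW_TRACKING_URI` environment variable, which MLflow clients read when no tracking URI has been set explicitly. The helper name `configure_tracking` is hypothetical and only illustrates the idea; calling `mlflow.set_tracking_uri(url)` at the top of a script is an equivalent alternative.

```python
import os

# Tracking URL given in this documentation.
ILUM_TRACKING_URL = "http://ilum-mlflow-tracking:5000"


def configure_tracking(url: str = ILUM_TRACKING_URL) -> str:
    """Export the tracking URL for MLflow clients started in this process.

    Hypothetical helper: it only sets MLFLOW_TRACKING_URI and returns the value.
    """
    os.environ["MLFLOW_TRACKING_URI"] = url
    return os.environ["MLFLOW_TRACKING_URI"]


# Call this before the first MLflow call in the job or notebook.
active_url = configure_tracking()
print(active_url)
```

Setting the variable in one place keeps job scripts and notebooks free of hard-coded URLs scattered across cells.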

Tracking Experiments

MLflow tracking is preconfigured in Ilum. Once the Tracking URL is set correctly, runs and experiments are recorded on the tracking server automatically.

Log metrics and parameters in your machine learning scripts:

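import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression

# Placeholder estimator so the snippet runs; substitute your own trained model.
model = LogisticRegression().fit([[0.0], [1.0]], [0, 1])

with mlflow.start_run():
    mlflow.log_param("param1", 5)             # hyperparameter
    mlflow.log_metric("accuracy", 0.85)       # evaluation metric
    mlflow.sklearn.log_model(model, "model")  # model artifact stored under "model"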

Managing Models

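Register a model from a logged run and create its first version:

from mlflow.tracking import MlflowClient

client = MlflowClient()
run_id = "<RUN_ID>"  # ID of the run that logged the model
run_uri = f"runs:/{run_id}/model"
model_name = "dev.MLTeam.model-name"
client.create_registered_model(model_name)
# The third argument of create_model_version is the ID of the source run.
mv_src = client.create_model_version(model_name, run_uri, run_id)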

Assign aliases to model versions:

client.set_registered_model_alias(model_name, "champion", "2")
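Once an alias is assigned, a model can be addressed by alias instead of by version number. The sketch below builds the two URI forms involved: the `runs:/` form used when registering and the `models:/<name>@<alias>` form that MLflow resolves to whichever version currently holds the alias. The helper names are hypothetical and exist only for illustration.

```python
def run_model_uri(run_id: str, artifact_path: str = "model") -> str:
    """URI of a model artifact logged under a run, e.g. runs:/<RUN_ID>/model."""
    return f"runs:/{run_id}/{artifact_path}"


def alias_model_uri(model_name: str, alias: str) -> str:
    """URI that resolves to the version currently holding the given alias."""
    return f"models:/{model_name}@{alias}"


source_uri = run_model_uri("<RUN_ID>")
champion_uri = alias_model_uri("dev.MLTeam.model-name", "champion")
print(source_uri)
print(champion_uri)
```

A URI such as `champion_uri` can then be passed to a loader like `mlflow.pyfunc.load_model`, so consumers keep working when the alias is moved to a newer version.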

Conclusion

Ilum's MLflow integration supports machine learning workflows from experiment tracking to model deployment, backed by Ilum's own storage and tracking services. Use the steps and configurations outlined here to get started with MLflow in Ilum.
