MLflow
What is MLflow?
MLflow is an open-source platform for managing the machine learning lifecycle: tracking experiments, managing model versions, and deploying models to production. Ilum integrates MLflow so that data scientists and engineers can use these features from Ilum jobs and notebooks. This documentation covers the features, integration steps, and usage of MLflow within the Ilum ecosystem.
Features of MLflow
Experiment Tracking
- Log parameters, metrics, and artifacts for every machine learning experiment.
- Visualize and compare experiment results in a centralized UI.
- Automatically store experiment metadata in Ilum's integrated storage solutions.
Model Registry
- Manage models with version control.
- Transition models through lifecycle stages: development, staging, and production.
- Integrate directly with Ilum deployment pipelines.
Setting Up MLflow in Ilum
Helm Deployment
To enable MLflow in Ilum, use the following Helm command:
helm upgrade \
  --set mlflow.enabled=true \
  --reuse-values ilum ilum/ilum
This command will:
- Deploy MLflow and the MLflow tracking server.
- Integrate MLflow with Ilum's storage and tracking services.
- Enable MLflow Tracking UI and Model Registry.
Using MLflow in Ilum
Configuration Setup
Global Configuration
To enable MLflow across the current Ilum instance:
- Update the Spark image in the cluster settings to ilum/spark:3.5.3-mlflow or a newer version.
Python/Scala Jobs
Add the following parameter in the service configuration:
spark.kubernetes.container.image: ilum/spark:3.5.3-mlflow
Jupyter Notebook
Add the following parameter in the properties JSON under the conf array:
"spark.kubernetes.container.image": "ilum/spark:3.5.3-mlflow"
Regardless of the environment, set the Tracking URL to: http://ilum-mlflow-tracking:5000.
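One way to apply the Tracking URL from Python code is sketched below. MLflow reads the MLFLOW_TRACKING_URI environment variable when no URI has been set explicitly, so setting it before the first MLflow call points the client at the Ilum tracking server. Calling mlflow.set_tracking_uri with the same URL has the same effect. Whether Ilum already exports this variable in a given environment is not stated here, so treat the assignment as an assumption.

```python
import os

# Tracking URL given in this documentation.
TRACKING_URL = "http://ilum-mlflow-tracking:5000"

# MLflow falls back to this environment variable when no tracking URI
# has been set in code, so set it before the first MLflow call.
os.environ["MLFLOW_TRACKING_URI"] = TRACKING_URL

print(os.environ["MLFLOW_TRACKING_URI"])
```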
Tracking Experiments
MLflow tracking is preconfigured in Ilum. Once the Tracking URL is set, runs and experiments are recorded on the Ilum tracking server without further setup.
- Log metrics and parameters in your machine learning scripts:
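import mlflow
import mlflow.sklearn

# `model` is assumed to be a scikit-learn estimator that has already been fitted.
with mlflow.start_run():
    # Record one hyperparameter and one evaluation metric for this run.
    mlflow.log_param("param1", 5)
    mlflow.log_metric("accuracy", 0.85)
    # Store the fitted model as a run artifact under the path "model".
    mlflow.sklearn.log_model(model, "model")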
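When hyperparameters live in a nested configuration, they can be flattened before logging, since mlflow.log_params takes a flat mapping of names to values. The sketch below is one possible approach, not part of Ilum: the helper names flatten_params and log_run and the sample config are hypothetical, and log_run assumes MLflow is installed and the Tracking URL is already set.

```python
def flatten_params(config: dict, prefix: str = "") -> dict:
    """Flatten a nested dict into dot-separated keys."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested sections, carrying the key path along.
            flat.update(flatten_params(value, name))
        else:
            flat[name] = value
    return flat


def log_run(config: dict, metrics: dict) -> None:
    """Log flattened parameters and a dict of metrics in a single run."""
    # Imported here so flatten_params can be used without MLflow installed.
    import mlflow

    with mlflow.start_run():
        mlflow.log_params(flatten_params(config))
        mlflow.log_metrics(metrics)


# Hypothetical configuration, for illustration only.
config = {"model": {"max_depth": 5, "n_estimators": 100}, "seed": 42}
print(flatten_params(config))
```

A call such as log_run(config, {"accuracy": 0.85}) would then record every flattened parameter and the metric in one run.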
Managing Models
- Register a model in the MLflow Model Registry:
from mlflow.tracking import MlflowClient
client = MlflowClient()
run_uri = "runs:/<RUN_ID>/model"
model_name = "dev.MLTeam.model-name"
client.create_registered_model(model_name)
mv_src = client.create_model_version(model_name, run_uri, run_id="<RUN_ID>")
- Assign aliases to model versions:
client.set_registered_model_alias(model_name, "champion", "2")
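The registration and alias steps above can be combined, and the aliased version can then be loaded through a models:/<name>@<alias> URI. The following is a minimal sketch under stated assumptions: the helper names runs_uri, alias_uri, and register_and_load are hypothetical, the model is assumed to have been logged under the artifact path "model", and register_and_load requires MLflow plus a reachable tracking server. It uses mlflow.register_model, which creates the registered model if it does not exist yet.

```python
def runs_uri(run_id: str, artifact_path: str = "model") -> str:
    """Build the URI of a model logged in a run."""
    return f"runs:/{run_id}/{artifact_path}"


def alias_uri(model_name: str, alias: str) -> str:
    """Build the URI that resolves a registered model by alias."""
    return f"models:/{model_name}@{alias}"


def register_and_load(run_id: str, model_name: str, alias: str = "champion"):
    """Register the run's model, point the alias at it, and load it back."""
    # Imported here so the URI helpers can be used without MLflow installed.
    import mlflow
    import mlflow.pyfunc
    from mlflow.tracking import MlflowClient

    # Creates a new version of the registered model from the run's artifact.
    version = mlflow.register_model(runs_uri(run_id), model_name)
    # Move the alias to the version that was just created.
    MlflowClient().set_registered_model_alias(model_name, alias, version.version)
    return mlflow.pyfunc.load_model(alias_uri(model_name, alias))


print(alias_uri("dev.MLTeam.model-name", "champion"))
```

Loading by alias means downstream code does not need to change when the alias is moved to a newer version.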
Conclusion
With MLflow enabled in Ilum, experiment tracking and the Model Registry are available to the jobs and notebooks that run on the cluster. Use the steps and configurations outlined here to get started with MLflow in Ilum.