Working with MLflow

Anurag Chatterjee
3 min read · Jan 23, 2021


This post shows how MLflow can be used to manage your machine learning life cycle. I will walk through the major components of MLflow and how they can be used. I have installed MLflow via pip and use the SQLite backend so that model registration works.

IRIS classification with MLflow

The script below classifies flowers from the IRIS dataset while leveraging the MLflow features one will typically use in a model development lifecycle.

Experiments and runs

An experiment is a container for different runs, so it is useful to give it a meaningful name. Two experiments cannot share the same name, so if the experiment already exists you need to look it up by name and use its ID to register the run under it.

Experiment with one run

Logging parameters, metrics, and model artifacts

A run can consist of a complete flow with a specific machine learning algorithm and its hyperparameters. The run can contain the metrics associated with the model that was trained and also the model artifacts.

Parameters, metrics, and artifacts for a specific run

Model registration

In order to maintain the lineage of machine learning models and to govern models that are deployed to production, models should be registered in MLflow. Do note that model registration does not work with the local file-system backend, hence I used the SQLite backend, which is also easy to set up.

Registered models

Model scoring

Models managed by MLflow can be used for batch prediction on a pandas DataFrame, or can be served as a REST API for real-time prediction based on the request sent to the API.

Batch mode

In batch mode, the pyfunc model is first loaded and the test data frame is then passed to its predict function to generate the predictions.

REST API via MLFlow model serving

The REST-based model serving is a turnkey (ready-to-go) solution from MLflow: the user only needs to issue a CLI command, which creates a Conda environment from which the model is served as a REST API. For the IRIS classification, the command below is what I used:

mlflow models serve -m mlruns/1/b6812f6de81a4e74b65cf50f9d019924/artifacts/model/

Please note that while this command is running, do not run the MLflow UI on port 5000, as models are also served on that port by default. Also, the string starting with 'b6812' needs to be replaced with the run ID generated by MLflow while performing the run. This will ideally be the run that produced the best model based on the metrics defined for the particular business problem.
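The request body can be constructed as sketched below. The feature column names follow scikit-learn's IRIS dataset, and the pandas-split payload format is what MLflow 1.x expects (newer MLflow versions wrap it in a `dataframe_split` key), so treat both as assumptions to check against your version:

```python
import json
from urllib import request

# One test row in MLflow 1.x's pandas "split" orientation
payload = {
    "columns": ["sepal length (cm)", "sepal width (cm)",
                "petal length (cm)", "petal width (cm)"],
    "data": [[5.1, 3.5, 1.4, 0.2]],
}

def score(url="http://127.0.0.1:5000/invocations"):
    # POST the rows to the server started by `mlflow models serve`
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; format=pandas-split"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```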

The deployed model can then be used to make predictions, as shown in the Postman screenshot below.

API request and response from the deployed IRIS model

For further details, please read the official MLflow documentation.
