MLproject Files: Recipes for Training

A YAML file that says: here is the entry point, the environment, and the parameters.


The recipe file

MLproject in one screen

An MLproject file lives at the root of your repo and tells the mlflow run command three things:

  1. Which environment to build (conda / virtualenv / Docker).
  2. Which entry points exist (main, eval, score, …).
  3. Which parameters each entry point accepts, with types and defaults.

name: churn-model
python_env: python_env.yaml

entry_points:
  main:
    parameters:
      lr:         {type: float,  default: 0.01}
      batch_size: {type: int,    default: 64}
      data_uri:   {type: string, default: 's3://bucket/snap.parquet'}
    command: 'python train.py --lr {lr} --batch-size {batch_size} --data {data_uri}'
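
The command template above implies that train.py accepts three flags. The original does not show train.py, so the following is only a minimal sketch of the receiving side, assuming argparse; the parse_args helper and the parser layout are hypothetical, and the defaults simply mirror the MLproject file.

```python
import argparse


def parse_args(argv=None):
    """Parse the flags that the MLproject command passes to train.py (hypothetical)."""
    parser = argparse.ArgumentParser(description="Hypothetical train.py entry point")
    # Flag names match the command template; defaults mirror the MLproject file.
    parser.add_argument("--lr", type=float, default=0.01)
    parser.add_argument("--batch-size", type=int, default=64)
    parser.add_argument("--data", default="s3://bucket/snap.parquet")
    return parser.parse_args(argv)


# Simulate the arguments produced by: -P lr=0.05 -P batch_size=128
args = parse_args(["--lr", "0.05", "--batch-size", "128"])
# argparse exposes --batch-size as the attribute batch_size.
print(args.lr, args.batch_size, args.data)
```

Keeping the same defaults in both places is a design choice rather than a requirement: it lets the script behave the same way whether it is started through the MLproject file or directly with python train.py.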

Anyone can now run:

mlflow run . -P lr=0.05 -P batch_size=128

and get a reproducible run, with data_uri left at its default and the environment rebuilt from python_env.yaml, on their laptop or on a Databricks or Kubernetes backend.
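
To see what that invocation resolves to, the following sketch models the parameter handling: overrides passed with -P are merged over the declared defaults, and the result fills the command template. It is an illustrative model, not MLflow's actual implementation; the resolve_command helper is hypothetical, and it covers declared parameters only.

```python
import shlex

# Defaults and command template copied from the MLproject file.
DEFAULTS = {"lr": 0.01, "batch_size": 64, "data_uri": "s3://bucket/snap.parquet"}
COMMAND = "python train.py --lr {lr} --batch-size {batch_size} --data {data_uri}"


def resolve_command(overrides):
    """Merge -P style overrides over the defaults and fill the command template.

    Simplified, hypothetical model: only declared parameters are handled.
    """
    params = {**DEFAULTS, **overrides}
    # Quote each value so it stays a single shell argument.
    quoted = {name: shlex.quote(str(value)) for name, value in params.items()}
    return COMMAND.format(**quoted)


print(resolve_command({"lr": 0.05, "batch_size": 128}))
# → python train.py --lr 0.05 --batch-size 128 --data s3://bucket/snap.parquet
```

Because data_uri was not overridden, the default from the file ends up in the final command.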

Analogy

An MLproject is a cocktail recipe card. It tells the bartender (the runner) which glassware (env), which bottles (params), and the exact pouring order (command). Hand the card to any bartender in any bar in the world and they will make the same drink.
