One command, REST endpoint
The simplest possible production
mlflow models serve -m 'models:/credit-scoring/Production' -p 5000
Under the hood MLflow spins up a Flask/FastAPI server that:
- loads the model in its
pyfuncflavor, - restores the recorded environment (
python_env.yaml), - enforces the signature on inbound payloads,
- exposes
/invocations(and/ping).
This is fine for staging and small-scale traffic. For real production you'll wrap it (Docker), scale it (KServe / SageMaker / Azure ML) and put it behind an API gateway — but the loaded model is identical to the one you served locally.