MLflow | AI Insight Note

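MLflow is an ML experiment management platform that Databricks released as open source. It covers experiment tracking, model packaging, a model registry, and model serving in a single tool, and it is framework-agnostic.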

Core components

Component	Role
MLflow Tracking	Records parameters, metrics, and artifacts
MLflow Projects	Packages experiments so they can be reproduced
MLflow Models	Standard model format
MLflow Model Registry	Model version management

Experiment tracking

python

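import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Example data (the Iris dataset stands in for your own train/test split)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

mlflow.set_experiment("my-experiment")

with mlflow.start_run():
    # Log hyperparameters
    n_estimators = 100
    max_depth = 5
    mlflow.log_params({"n_estimators": n_estimators, "max_depth": max_depth})

    # Train the model
    model = RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth)
    model.fit(X_train, y_train)

    # Log metrics
    acc = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_metric("accuracy", acc)

    # Save the model as a run artifact
    mlflow.sklearn.log_model(model, "model")
    print(f"Accuracy: {acc:.4f}")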

Model registry

python

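import mlflow
import mlflow.sklearn
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Register the model ("abc123" is a placeholder run ID)
result = mlflow.register_model("runs:/abc123/model", "ProductionModel")

# Promote the registered version to Production
# (model stages are deprecated as of MLflow 2.9 in favor of aliases)
client.transition_model_version_stage(
    name="ProductionModel",
    version=result.version,
    stage="Production"
)

# Load the Production model
model = mlflow.sklearn.load_model("models:/ProductionModel/Production")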

MLflow UI

bash

mlflow ui --host 0.0.0.0 --port 5000
# Open http://localhost:5000 in a browser (--host 0.0.0.0 also accepts connections from other machines)
