MLflow 3.0: What Spark Scala Developers Need to Know

MLflow 3.0 (released June 2025) rebuilt the platform around LoggedModel as a first-class entity, added GenAI tracing on top of OpenTelemetry, and reorganized how artifacts are stored. Most of the headline features land in the Python and TypeScript SDKs, but the JVM tracking client is still the path Scala teams use to log Spark ML runs from production code — and the changes underneath it are worth knowing before you upgrade.

The Headline

MLflow 3.0 shipped on June 11, 2025 as the first major version bump in three years. The framing changed: MLflow 2.x was an ML lifecycle tool that grew GenAI features around the edges; MLflow 3 is pitched as a unified platform for traditional ML, deep learning, and GenAI under one set of tracking, registry, and evaluation APIs.

For Spark Scala developers, that pitch is mostly a Python and TypeScript story. The MLflow Java/Scala tracking client (org.mlflow:mlflow-client) is still the artifact that lets a Scala job emit runs, params, metrics, and tags to an MLflow server, and the mlflow-spark JAR is still how Spark datasource reads get auto-logged. The pieces that changed under the hood — LoggedModel, the new artifact layout, autologging without an active run — affect how that data looks when you query it back out, even if your write path stays in Scala.

If you're running Spark ML pipelines in Scala and writing to an MLflow tracking server, this article is for you. If you're using PySpark with the Python MLflow SDK, most of MLflow 3's surface area is more directly usable — but the JVM-side considerations still apply when the Spark cluster itself logs through mlflow-spark. For broader context on the JVM-and-Scala-still-matter argument, see Why JVM Scala Spark Still Makes Sense in 2026.

What Changed in MLflow 3.0

The shifts that matter for an existing tracking setup, in rough order of impact:

LoggedModel is now a first-class entity. In MLflow 2.x, a model was an artifact attached to a run. In MLflow 3, models are their own entity with their own ID, lineage, and lifecycle. You can call log_model without mlflow.start_run(), and the same model can be associated with multiple runs over time (training, evaluation, fine-tuning). The migration guide calls this out as the central architectural change.
Artifacts moved. Model files used to live at experiments/<id>/runs/<id>/artifacts/<artifact_path>. They now live at experiments/<id>/models/<model_id>/artifacts/. Anything that hard-coded the old layout — custom artifact UIs, S3 listing scripts, downstream tools that fetched via mlflow.get_artifact_uri("model") — will break and needs to use the URI returned by log_model() directly.
GenAI tracing built on OpenTelemetry. MLflow's new tracing layer instruments 20+ GenAI libraries (LangChain, LlamaIndex, OpenAI SDK, Anthropic SDK, PydanticAI, smolagents) and exports spans in OpenTelemetry format. If your observability stack already speaks OTel, MLflow traces flow into it natively.
Prompt Registry and LLM judges. Prompts get Git-style versioning with visual diffs; LLM judges automate quality evaluation of free-form text outputs. Both are useful if you build GenAI pipelines; both are irrelevant if you only train Spark MLlib models.
Removed flavors. mleap and fastai flavors are gone. The mleap removal is the relevant one for Spark teams — if you were exporting Spark ML pipelines as MLeap bundles for low-latency serving, that path is closed. The recommended replacement is the native Spark pyfunc flavor, which creates a local Spark session for inference.
MLflow Recipes removed. The Recipes framework is gone; migrate to plain tracking + registry, or to MLflow Projects if you need recipe-style packaging.
baseline_model parameter removed from evaluation. Replaced by mlflow.validate_evaluation_results for the same comparison workflow.

None of these break the Java client's public API — but they reshape what your tracking data looks like on the server and how downstream tools retrieve it.
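One way to find out which of your stored URIs are affected is to classify them by layout. The following is a minimal sketch, assuming the two path shapes quoted in the list above; ArtifactLayout and the bucket names in the usage note are hypothetical and not part of any MLflow API.

```scala
// Sketch only: the patterns mirror the two layouts described in this article.
// ArtifactLayout is a hypothetical helper, not an MLflow class.
object ArtifactLayout {
  sealed trait Layout
  case object RunScoped   extends Layout // MLflow 2.x: .../runs/<run_id>/artifacts/...
  case object ModelScoped extends Layout // MLflow 3:   .../models/<model_id>/artifacts/...
  case object Unknown     extends Layout

  private val RunPattern   = """experiments/[^/]+/runs/[^/]+/artifacts(/|$)""".r
  private val ModelPattern = """experiments/[^/]+/models/[^/]+/artifacts(/|$)""".r

  // Unanchored match, so any scheme or bucket prefix is accepted
  def classify(uri: String): Layout =
    if (ModelPattern.findFirstIn(uri).isDefined) ModelScoped
    else if (RunPattern.findFirstIn(uri).isDefined) RunScoped
    else Unknown
}
```

Running ArtifactLayout.classify over the URIs your downstream services have persisted (for example "s3://bucket/experiments/12/runs/abc123/artifacts/model") gives a rough inventory of what still assumes the run-scoped layout. If your artifact root uses a different directory shape, the patterns need adjusting.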

The Scala/JVM Story

MLflow has had a Java tracking client since the early days. It's published to Maven Central as org.mlflow:mlflow-client and exposes the same CRUD operations as the Python client over the REST API:

// build.sbt
libraryDependencies ++= Seq(
  "org.mlflow" % "mlflow-client" % "3.1.0",
  "org.mlflow" % "mlflow-spark_2.13" % "3.1.0"
)

The client itself talks to the MLflow tracking server over HTTP and is version-agnostic in the sense that an mlflow-client 3.x release will happily log runs to an MLflow 2.x server and vice versa, as long as the operations you call exist on both sides. What changes between major versions is the surface area, not the wire protocol. The pattern is the same one Spark Connect uses for its Scala client — a thin JVM API translating into a stable wire format the server understands.

The minimal Scala usage looks like:

import org.mlflow.tracking.MlflowClient
import org.mlflow.api.proto.Service.RunStatus

val client = new MlflowClient("http://mlflow-server:5000")

val experimentId = client.getExperimentByName("/spark-ml/customer-churn")
  .map(_.getExperimentId)
  .orElseGet(() => client.createExperiment("/spark-ml/customer-churn"))

// createRun returns the RunInfo of the newly created run
val runInfo = client.createRun(experimentId)
val runId = runInfo.getRunId

try {
  client.logParam(runId, "regParam", "0.01")
  client.logParam(runId, "elasticNetParam", "0.5")
  client.logMetric(runId, "auc", 0.873)
  client.logMetric(runId, "f1", 0.812)
  client.setTag(runId, "git_commit", sys.env.getOrElse("GIT_COMMIT", "unknown"))

  client.setTerminated(runId, RunStatus.FINISHED)
} catch {
  case t: Throwable =>
    client.setTerminated(runId, RunStatus.FAILED)
    throw t
}

This pattern hasn't changed in MLflow 3. The MlflowClient API stayed stable across the major version, which is the right call — there's a long tail of JVM jobs in production logging through this client, and breaking them to push the new entity model into Java surface would have created a migration that nobody asked for.
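If several jobs repeat the same create/terminate bookkeeping, it can be factored into a loan-pattern helper. This is a sketch under stated assumptions: RunLifecycle is a hypothetical name, and the two function parameters stand in for the client calls so the block compiles without MLflow on the classpath.

```scala
import scala.util.control.NonFatal

// Sketch only: RunLifecycle is a hypothetical helper, not part of the MLflow client.
// `start` stands in for creating a run and returning its ID;
// `finish` stands in for client.setTerminated with FINISHED (true) or FAILED (false).
object RunLifecycle {
  def withRun[A](start: () => String, finish: (String, Boolean) => Unit)(body: String => A): A = {
    val runId = start()
    try {
      val result = body(runId)
      finish(runId, true)    // body completed: mark the run as finished
      result
    } catch {
      case NonFatal(e) =>
        finish(runId, false) // body threw: mark the run as failed, then rethrow
        throw e
    }
  }
}
```

Wired to the real client, start would be () => client.createRun(experimentId).getRunId and finish would be (id, ok) => client.setTerminated(id, if (ok) RunStatus.FINISHED else RunStatus.FAILED). Catching NonFatal rather than Throwable leaves fatal JVM errors alone, which is a design choice of this sketch rather than something the MLflow client requires.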

What the Java client doesn't have, even on MLflow 3.x, is first-class support for the new MLflow 3 concepts: there's no LoggedModel builder, no tracing instrumentation, no Prompt Registry client. You can still write to those entities via the underlying REST API if you need to (the proto definitions are in org.mlflow.api.proto), but the ergonomic Java wrappers aren't there. For now, the JVM client is a tracking client first and an everything-else client a distant second.

Logging Spark ML Models From Scala

The MlflowClient handles params, metrics, tags, and arbitrary artifacts — but it doesn't have a built-in logSparkModel helper. The standard Scala approach is to save the Spark ML pipeline to a temp directory, then upload it as an artifact under the conventional path:

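import org.apache.spark.ml.{Pipeline, PipelineModel}
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}
import org.mlflow.api.proto.Service.RunStatus
import org.mlflow.tracking.MlflowClient
import java.io.File
import java.nio.file.Files

// Assumes trainingDF and testDF are DataFrames with "text" and "label" columns,
// and that experimentId was resolved as in the previous snippet.
val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
val hashingTF = new HashingTF().setInputCol("words").setOutputCol("features")
val lr        = new LogisticRegression().setMaxIter(20).setRegParam(0.01)

val pipeline  = new Pipeline().setStages(Array(tokenizer, hashingTF, lr))
val model     = pipeline.fit(trainingDF)
val evaluator = new BinaryClassificationEvaluator().setMetricName("areaUnderROC")

// With no arguments, the client reads the server address from MLFLOW_TRACKING_URI
val client = new MlflowClient()
val runId  = client.createRun(experimentId).getRunId

// Save the Spark ML pipeline to a driver-local directory, then upload it as an artifact.
// The explicit file: URI keeps Spark from resolving the path against the cluster's
// default filesystem; this assumes the driver can read everything Spark wrote there.
val localDir = Files.createTempDirectory("spark-ml-model").toFile
model.write.overwrite().save(new File(localDir, "sparkml").toURI.toString)
client.logArtifacts(runId, localDir, "model")

client.logParam(runId, "regParam", "0.01")
client.logParam(runId, "maxIter", "20")
client.logMetric(runId, "areaUnderROC", evaluator.evaluate(model.transform(testDF)))

client.setTerminated(runId, RunStatus.FINISHED)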

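The artifact path "model" is the convention mlflow.spark.log_model() uses on the Python side, and the sparkml subdirectory matches where that flavor keeps the serialized pipeline. The layout alone is not enough for mlflow.spark.load_model("runs:/<run-id>/model") to work from Python, though: the loader reads an MLmodel descriptor at the root of the model directory, and the snippet above does not write one. Either upload an MLmodel file declaring the spark flavor alongside the pipeline, or download the artifact and load it with PipelineModel.load. The first option is the one that suits cross-language ML platforms where training happens in Scala but serving happens in PyFunc.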

The MLflow 3 wrinkle: LoggedModel is a separate entity on the server side, but logArtifacts only writes into the run's artifact directory, so a model logged this way from Scala stays attached to the run under the legacy layout and does not become a LoggedModel. On a server that also receives models from the Python SDK, or one that was upgraded from 2.x, both layouts coexist. Rather than inferring the location from the server version, read the artifact URI the API returns for the run and use that.

The honest read: the Java client lags behind on the LoggedModel concept. If you need first-class LoggedModel semantics from a JVM job today, log the artifact from Scala and then run a small Python helper that calls mlflow.register_model to promote it into the registry with the new metadata. It's an extra step, but it bridges the gap until the JVM client catches up.

The mlflow-spark JAR: Datasource Autologging

Separate from the MlflowClient you call explicitly, MLflow ships an mlflow-spark JAR that registers as a Spark listener and auto-logs datasource reads — file paths, formats, and versions for every read your job performs. This gets attached to the active MLflow run as tags, giving you data lineage without explicit instrumentation.

The Scala version match is non-negotiable: the JAR you pick has to match the Scala version Spark itself was compiled against. For Spark 3.4.1 on Scala 2.13, that's mlflow-spark_2.13. Mismatches surface as ClassNotFoundException or NoSuchMethodError deep in the listener registration path — confusing because the rest of your job runs fine and only the lineage logging breaks.

// build.sbt — match the Scala version your Spark cluster runs
val sparkVersion = "3.4.1"
val mlflowVersion = "3.1.0"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"   % sparkVersion % Provided,
  "org.apache.spark" %% "spark-mllib" % sparkVersion % Provided,
  "org.mlflow"        % "mlflow-client"      % mlflowVersion,
  "org.mlflow"        % "mlflow-spark_2.13"  % mlflowVersion
)
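Because a mismatch only shows up deep inside listener registration, it can help to fail fast at job startup instead. The guard below is a minimal sketch: ScalaBinaryGuard is a hypothetical helper, and the expected version is whatever suffix you chose for the mlflow-spark artifact.

```scala
// Sketch only: ScalaBinaryGuard is a hypothetical helper, not part of Spark or MLflow.
object ScalaBinaryGuard {
  // Keeps the first two components of a full version string
  def binaryVersion(full: String): String = full.split('.').take(2).mkString(".")

  // Compares the expected binary version with the Scala library on the runtime classpath
  def check(
      expected: String,
      actualFull: String = scala.util.Properties.versionNumberString
  ): Either[String, String] = {
    val actual = binaryVersion(actualFull)
    if (actual == expected) Right(actual)
    else Left(s"mlflow-spark was built for Scala $expected but the runtime is Scala $actual")
  }
}
```

Calling ScalaBinaryGuard.check("2.13") before building the SparkSession and aborting on a Left turns an obscure NoSuchMethodError into a readable message. It only checks the Scala library version visible to the driver, so it complements the build-time choice of artifact and does not replace it.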

The mlflow-spark listener is registered by the Python side via mlflow.spark.autolog(). On a pure-Scala job there's no equivalent toggle — the JAR has to be on the classpath, and you enable the listener manually:

val spark = SparkSession.builder()
  .appName("churn-training")
  .config("spark.extraListeners", "org.mlflow.spark.autologging.SparkDataSourceListener")
  .getOrCreate()

Once active, every spark.read.parquet("s3://...") or spark.read.format("delta").load(...) call is recorded on the active run as a datasource tag (the Python integration writes these under the sparkDatasourceInfo key). For ML pipelines that touch a dozen tables across training, the resulting lineage trail is genuinely useful when reproducing a result six months later.

One MLflow 3 caveat the documentation calls out: the autologging listener is not supported on Databricks shared or serverless clusters. The classloader isolation on those cluster types prevents the listener from seeing the catalog metadata it needs. If your Spark Scala jobs run on Databricks shared compute, you'll need to log datasets explicitly via setTag from Scala — the autolog path won't work.
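For the explicit route, the main decision is how to encode the reads into a tag value. The sketch below is one possibility under stated assumptions: DatasetRead, DatasetTags, and the key=value encoding are hypothetical, and you should pick whatever format your downstream tooling expects.

```scala
// Sketch only: DatasetRead, DatasetTags, and the encoding are hypothetical,
// not an MLflow convention.
final case class DatasetRead(path: String, format: String, version: Option[String] = None)

object DatasetTags {
  // Encodes one read as comma-separated key=value pairs; version is appended only when known
  def encode(read: DatasetRead): String = {
    val base = s"path=${read.path},format=${read.format}"
    read.version.fold(base)(v => s"$base,version=$v")
  }

  // One line per read, de-duplicated, in first-seen order
  def tagValue(reads: Seq[DatasetRead]): String =
    reads.map(encode).distinct.mkString("\n")
}
```

The result can be passed to client.setTag(runId, "datasets", DatasetTags.tagValue(reads)), where the "datasets" key is again only illustrative. Tracking servers cap the length of tag values, so jobs that read many tables may need to split the list across several tags.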

What This Means for Spark Scala Teams

Practical guidance, broken down by where your team is:

If you're already on MLflow 2.x with a Scala tracking setup: Upgrading the server to MLflow 3 is mostly safe — the Java client's API didn't change, and the wire protocol is backward compatible. Audit any code that hard-codes the run-artifact path (looks like runs:/<id>/<artifact_path> or constructs S3 URIs from run IDs) before flipping the server, since the new storage layout breaks those patterns. The MLflow 3 breaking changes guide is the authoritative reference.

If you're using mleap for low-latency serving: Plan a migration. The flavor is removed from MLflow 3, so even if you keep Spark and MLflow 2.x for now, the door's closing. The realistic replacements are: keep Spark for serving via the native flavor (high overhead, simple operationally), or export to ONNX (some Spark ML estimators support it, many don't) for true low-latency JVM-free inference.

If you're starting a new Spark ML platform in 2026: Skip MLflow 2 entirely and start on 3.x. The artifact layout is cleaner, the OpenTelemetry tracing path gives you future-proof observability, and the LoggedModel concept is what new tooling will assume. Use the Java client for the tracking write path and accept that some MLflow 3 features (LoggedModel-aware operations, tracing) require a Python helper today.

If your team does GenAI work alongside Spark ML: MLflow 3's value proposition is highest here, since one platform covers both. The Spark side stays on the Java client; the GenAI side is Python (or TypeScript). Tracking across both worlds only requires that each side points at the same server: mlflow.set_tracking_uri() in Python, and the MlflowClient constructor argument or the MLFLOW_TRACKING_URI environment variable in Scala. That is a one-line config on each side, not a structural change.

Should You Upgrade?

Yes — but with two conditions.

First, treat MLflow 3 server upgrades like any storage layout migration. Run on staging, exercise the artifact retrieval paths that downstream services depend on, and verify that your model registry's external integrations (CI/CD promotion gates, serving infrastructure, Unity Catalog if you're on Databricks) understand the new URIs. The actual upgrade is pip install mlflow==3.x followed by mlflow db upgrade against the backend store if it is a SQL database; the hard part is the long tail of code that knows where artifacts "should" be.

Second, accept that the JVM client is a follower, not a leader, in this release cycle. The big MLflow 3 features that drove the major version bump — LoggedModel as a first-class entity, GenAI tracing, the Prompt Registry — land first in Python. The Java client gets you the tracking surface and not much else. That's been the historical pattern for MLflow's polyglot story, and MLflow 3 didn't change it. If you need LoggedModel-aware code paths from Scala today, you'll be either bridging to Python helpers or talking to the REST API directly. If you're patient, JVM client parity tends to follow within a year or two of the corresponding Python release.

The good news for Scala teams already invested in Spark ML: the parts you actually use — the tracking client, the mlflow-spark JAR, the Java SDK surface — keep working through the upgrade. The platform around them is moving in a direction that genuinely helps teams running mixed ML and GenAI workloads, and the Spark Scala path is along for the ride.

For the full reference, see the MLflow 3 release announcement, the Databricks unified platform post, the Spark MLlib integration guide, and the MLflow 3 migration guide.