Modern AI pipelines move fast, but visibility often lags behind. As workloads scale across distributed Ray clusters, teams lose track of which datasets feed which models, which job produced a specific version, and how changes impact downstream systems. Debugging turns into a manual search through logs and tags, and reproducing a past run can stall entire releases. At Ray Summit this week, we announced Lineage Tracking, built on OpenLineage, which maps datasets and models across Ray workloads and integrates natively with Unity Catalog, MLflow, and W&B, delivering clearer pipeline transparency, easier debugging, and faster iteration.
Why lineage matters in AI pipelines
Map each dataset and model to the Workspace, Job, or Service that produced or consumed it
Visualize the pipeline in lineage graphs
Click through to the exact job or service to debug
Iterate and reproduce runs with captured parameters and environment
Integrate natively with Unity Catalog, MLflow, and W&B
Build on open standards and tap into the OpenLineage ecosystem to connect to any catalog or registry (see the sketch below)
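To give a sense of what an OpenLineage event looks like under the hood, here is a minimal sketch using the openlineage-python client to record a run that reads a training dataset and writes a model artifact. The endpoint URL, namespace, job name, and dataset paths are hypothetical placeholders, not part of the product described above.

```python
# Minimal sketch: emitting OpenLineage run events around a training job.
# Assumes the openlineage-python client; the URL, namespace, and dataset
# names below are hypothetical placeholders.
import uuid
from datetime import datetime, timezone

from openlineage.client import OpenLineageClient
from openlineage.client.run import Dataset, Job, Run, RunEvent, RunState

# Point the client at any OpenLineage-compatible backend (placeholder URL).
client = OpenLineageClient(url="http://localhost:5000")

run = Run(runId=str(uuid.uuid4()))
job = Job(namespace="ray-workloads", name="train_recommender")
producer = "https://example.com/my-training-pipeline"


def emit(state: RunState) -> None:
    """Send a run event with the job's input dataset and output model."""
    client.emit(
        RunEvent(
            eventType=state,
            eventTime=datetime.now(timezone.utc).isoformat(),
            run=run,
            job=job,
            producer=producer,
            inputs=[Dataset(namespace="s3", name="bucket/training/features.parquet")],
            outputs=[Dataset(namespace="s3", name="bucket/models/recommender-v2")],
        )
    )


emit(RunState.START)
# ... run the actual Ray training workload here ...
emit(RunState.COMPLETE)
```

Because the events follow the open standard, any catalog, registry, or lineage backend that speaks OpenLineage can consume them and reconstruct the dataset-to-model graph.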
Lineage Tracking is built for platform and ML engineers, data engineers, and MLOps leads running Ray workloads who need reproducibility, auditing, and faster incident and debugging loops across their data pipelines and model development workflows.
Reserve your spot today!