Through a demo using XGBoost for classification, we will show how you can scale training, hyperparameter tuning, and inference with Ray, moving from a single node to a cluster with tangible performance improvements.
Modern machine learning (ML) workloads, such as deep learning and large-scale model training, are compute-intensive and require distributed execution. Ray is an open-source, distributed framework from UC Berkeley’s RISELab that easily scales Python applications and ML workloads from a laptop to a cluster, with an emphasis on the unique performance challenges of ML/AI systems. It is now used in many production deployments.
This talk will give an overview of Ray, its architecture, core concepts, and primitives such as remote Tasks and Actors; briefly discuss Ray's native libraries (Ray Tune, Ray Train, Ray Serve, Ray Datasets, RLlib); and dive into Ray's growing ecosystem.
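To give a flavor of those primitives, here is a minimal sketch of a Ray remote task and a remote actor; the `square` function and `Counter` class are illustrative examples, not code from the talk.

```python
import ray

ray.init()  # start Ray locally; on a cluster, pass the cluster address instead

@ray.remote
def square(x):
    # A stateless remote task: calling it returns an object reference immediately.
    return x * x

@ray.remote
class Counter:
    # A stateful actor: a worker process that keeps state across method calls.
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

refs = [square.remote(i) for i in range(4)]
print(ray.get(refs))  # [0, 1, 4, 9]

counter = Counter.remote()
print(ray.get([counter.increment.remote() for _ in range(3)]))  # [1, 2, 3]
```

The same task and actor code runs unchanged on a laptop or a multi-node cluster, which is the pattern the rest of the talk builds on.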
Key takeaways:
Learn Ray's architecture, core concepts, primitives, and patterns
Find out why distributed computing will be the norm, not the exception
See how to scale your ML workloads with Ray libraries (sketched in the code below):
Training on a single node vs. a Ray cluster, using XGBoost with and without Ray
Hyperparameter search and tuning, using XGBoost with Ray Tune
Inference at scale, using XGBoost with and without Ray
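As a rough sketch of the distributed-training takeaway, the snippet below uses the xgboost_ray package, whose `RayDMatrix`, `RayParams`, and `train` APIs mirror plain XGBoost; the breast_cancer dataset is just an illustrative stand-in for the talk's data, and the actor count is an assumption you would tune to your cluster.

```python
from sklearn.datasets import load_breast_cancer
from xgboost_ray import RayDMatrix, RayParams, train

X, y = load_breast_cancer(return_X_y=True)
dtrain = RayDMatrix(X, y)  # a distributed counterpart to xgboost.DMatrix

evals_result = {}
booster = train(
    {"objective": "binary:logistic", "eval_metric": ["logloss", "error"]},
    dtrain,
    evals=[(dtrain, "train")],
    evals_result=evals_result,
    ray_params=RayParams(num_actors=2, cpus_per_actor=1),  # scale by adding actors
    num_boost_round=20,
)
print("final train error:", evals_result["train"]["error"][-1])
```

xgboost_ray also provides a `predict` counterpart that shards a dataset across the same kind of actors, which is the basis of the batch-inference comparison. For the tuning takeaway, a hedged Ray 1.x-style sketch follows: `train_xgb` is a hypothetical objective that trains a small XGBoost model and reports its final training error to Ray Tune.

```python
import xgboost as xgb
from ray import tune
from sklearn.datasets import load_breast_cancer

def train_xgb(config):
    # Train with the sampled hyperparameters and report the metric to Tune.
    X, y = load_breast_cancer(return_X_y=True)
    dtrain = xgb.DMatrix(X, label=y)
    results = {}
    xgb.train(
        {"objective": "binary:logistic", "eval_metric": "error", **config},
        dtrain,
        num_boost_round=20,
        evals=[(dtrain, "train")],
        evals_result=results,
    )
    tune.report(error=results["train"]["error"][-1])

analysis = tune.run(
    train_xgb,
    config={"max_depth": tune.randint(2, 10), "eta": tune.loguniform(1e-3, 3e-1)},
    num_samples=20,  # number of trials; Tune schedules them across the cluster
    metric="error",
    mode="min",
)
print("best hyperparameters:", analysis.best_config)
```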
Jules S. Damji is a lead developer advocate at Anyscale and an MLflow contributor. He is a hands-on developer with more than 20 years of experience and has worked at leading companies, such as Sun Microsystems, Netscape, @Home, Opsware/Loudcloud, VeriSign, ProQuest, Hortonworks, and Databricks, building large-scale distributed systems. He holds a B.Sc. and M.Sc. in Computer Science (from Oregon State University and Cal State, Chico, respectively) and an MA in Political Advocacy and Communication (from Johns Hopkins University).