Building a scalable ML model serving API with Ray Serve

Thursday, September 9, 9 AM PDT

Ray Serve is a framework-agnostic and Python-first model serving library built on Ray. In this introductory webinar on Ray Serve, we will highlight how Ray Serve makes it easy to deploy, operate and scale a machine learning API.

The core of the webinar will be a live demo that shows how to build a scalable API using Natural Language Processing models.

The demo will show how to:
- Deploy a trained Python model and scale it to a cluster using Ray Serve
- Improve the HTTP API using Ray Serve’s native FastAPI integration
- Compose multiple independently scalable models into a single model and run them in parallel to minimize latency
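
To make the last point concrete, here is a minimal sketch of why running composed models in parallel reduces latency. It deliberately uses plain Python asyncio rather than Ray Serve, and it is not the code from the demo. The two "models" (summarize and classify_sentiment) are hypothetical stand-ins that sleep to simulate inference time.

```python
import asyncio
import time


# Hypothetical stand-in for an NLP model: sleeps to simulate inference,
# then returns the first sentence of the input as a "summary".
async def summarize(text: str) -> str:
    await asyncio.sleep(0.2)
    return text.split(".")[0].strip() + "."


# Hypothetical stand-in for a second model: a toy keyword-based sentiment rule.
async def classify_sentiment(text: str) -> str:
    await asyncio.sleep(0.2)
    return "positive" if "easy" in text.lower() else "neutral"


# The "composed model": both calls are started together with asyncio.gather,
# so the total latency is roughly the slower of the two, not their sum.
async def composed_model(text: str) -> dict:
    summary, sentiment = await asyncio.gather(
        summarize(text),
        classify_sentiment(text),
    )
    return {"summary": summary, "sentiment": sentiment}


def run_demo(text: str):
    start = time.perf_counter()
    result = asyncio.run(composed_model(text))
    elapsed = time.perf_counter() - start
    return result, elapsed


result, elapsed = run_demo(
    "Ray Serve makes it easy to deploy models. It also scales them."
)
print(result, f"{elapsed:.2f}s")
```

Calling the two stand-ins one after the other would take at least 0.4 seconds here, while the gathered version finishes in roughly the time of a single call. In Ray Serve, each model would instead be its own deployment that scales independently, with the composing deployment issuing both calls concurrently.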


Speakers

Tricia Fu

Product Manager, Anyscale

Tricia is a Product Manager at Anyscale. Before that, she spent time at Google as a Product Manager and at LinkedIn as a Software Engineer. She holds a BS in Electrical Engineering and Computer Science from UC Berkeley. In her free time, she loves taking her dog on alpine lake hikes!
