Webinar

Ray Serve 101: Deploying your first ML model locally and as a managed service

Ray Serve supports inference on CPUs, GPUs (even fractional GPUs!), and other accelerators – using just Python code.

In addition to single-node serving, Serve enables seamless multi-model inference pipelines (also known as model composition); autoscaling via Kubernetes, both locally and in the cloud; and integrations between business logic and machine learning model code. You can run Ray Serve applications on a single node, or on a cluster, with minimal to zero code changes.

Ray Serve is:

  • Framework-agnostic: Use a single toolkit to serve everything from deep learning models built with frameworks like PyTorch, TensorFlow, and Keras, to scikit-learn models, to arbitrary Python business logic.

  • Python-first: Configure your model serving declaratively in pure Python, without needing YAML or JSON configs.

  • Web framework-friendly: Natively integrated with FastAPI, with support for any Python web server.
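
As a rough illustration of the Python-only workflow described above, here is a minimal sketch of a Serve deployment. The KeywordSentiment class and the deploy() helper are hypothetical names made up for this example, and the keyword-counting "model" is only a stand-in for a real one. The sketch assumes a recent Ray 2.x release installed with `pip install "ray[serve]"`; Ray is imported lazily inside deploy() so the model logic can be exercised without it.

```python
class KeywordSentiment:
    """Hypothetical stand-in for a real model: scores text by keyword counts."""

    POSITIVE = {"good", "great", "love"}
    NEGATIVE = {"bad", "poor", "hate"}

    def predict(self, text: str) -> dict:
        # Count positive and negative keywords and derive a label from the difference.
        words = text.lower().split()
        score = sum(w in self.POSITIVE for w in words) - sum(w in self.NEGATIVE for w in words)
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        return {"label": label, "score": score}

    async def __call__(self, request):
        # For HTTP traffic, Serve passes a Starlette Request to __call__.
        payload = await request.json()
        return self.predict(payload["text"])


def deploy(num_gpus: float = 0.0):
    """Wrap the class as a Serve deployment and run it on the local Ray instance."""
    from ray import serve  # lazy import; requires `pip install "ray[serve]"`

    deployment = serve.deployment(
        num_replicas=1,
        # Resources per replica; a value such as 0.5 requests a fractional GPU.
        ray_actor_options={"num_gpus": num_gpus},
    )(KeywordSentiment)
    return serve.run(deployment.bind())
```

Calling deploy() on a laptop starts Serve locally, by default on port 8000, after which a JSON body such as {"text": "I love this"} can be sent to the endpoint with any HTTP client. Passing num_gpus=0.5 is one way to try the fractional-GPU scheduling mentioned earlier, assuming the machine or cluster actually has a GPU available.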

By the end of the webinar, you will understand how to deploy a machine learning model either locally, or as a managed service on Anyscale (via AWS or GCP). No specialized machine learning knowledge is required to attend.

Join the discussion with fellow Ray and Managed Ray Serve users on the Ray forum and the Ray Slack.

Resources & Materials

Speakers

Paige Bailey

Product Management Lead, Anyscale

Simon Mo

Software Engineer, Anyscale

Edward Oakes

Software Engineer, Anyscale
