Webinar
Ray Serve 101: Deploying your first ML model locally and as a managed service
Thursday, May 5, 9 AM PDT
Ray Serve supports inference on CPUs, GPUs (even fractional GPUs!), and other accelerators, using just Python code.
In addition to single-node serving, Serve enables seamless multi-model inference pipelines (also known as model composition); autoscaling via Kubernetes, both locally and in the cloud; and integrations between business logic and machine learning model code. You can run Ray Serve applications on a single node or on a cluster, with little to no code changes.
Ray Serve is:
Framework-agnostic: Use a single toolkit to serve everything from deep learning models built with frameworks like PyTorch, TensorFlow, and Keras, to scikit-learn models, to arbitrary Python business logic.
Python-first: Configure your model serving declaratively in pure Python, without needing YAML or JSON configs.
Natively integrated with FastAPI, with support for arbitrary Python web servers.
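To make the "Python-first" point above concrete, here is a minimal sketch of wrapping plain Python logic in a Serve deployment. It is not taken from the webinar materials. It assumes the Ray 2.x Serve API (`serve.deployment`, `.bind()`, `serve.run`), which differs from the API that was current at the time of the webinar. The names `Greeter`, `GreeterDeployment`, and `serve_locally` are hypothetical, and the "model" is a stand-in for a real one.

```python
class Greeter:
    """Stand-in for a model: plain Python logic with no Ray dependency."""

    def __init__(self, greeting: str = "Hello"):
        self.greeting = greeting

    def predict(self, name: str) -> str:
        return f"{self.greeting}, {name}!"


def serve_locally() -> None:
    """Wrap Greeter in a Serve deployment and run it on the local machine.

    Assumes Ray 2.x, installed with: pip install "ray[serve]"
    """
    # Imported lazily so the plain-Python part above works without Ray.
    from ray import serve
    from starlette.requests import Request

    # Resource needs are declared in Python rather than in YAML or JSON.
    # On a GPU machine, {"num_gpus": 0.5} would request a fractional GPU.
    @serve.deployment(num_replicas=2, ray_actor_options={"num_cpus": 1})
    class GreeterDeployment:
        def __init__(self):
            self.model = Greeter()

        async def __call__(self, request: Request) -> str:
            name = request.query_params.get("name", "world")
            return self.model.predict(name)

    # Starts Ray locally if needed and deploys the application over HTTP.
    serve.run(GreeterDeployment.bind())
    input("Serving on http://localhost:8000/ (press Enter to stop)\n")
    serve.shutdown()
```

Calling `serve_locally()` and then requesting `http://localhost:8000/?name=Ray` should return the greeting. The business logic stays in an ordinary class, so it can be tested without a running cluster.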
By the end of the webinar, you will understand how to deploy a machine learning model either locally, or as a managed service on Anyscale (via AWS or GCP). No specialized machine learning knowledge is required to attend.
Join the discussion with fellow Ray and Managed Ray Serve users on the Ray forum and the Ray Slack.
Resources & Materials
Speakers

Paige Bailey
Product Management Lead, Anyscale

Simon Mo
Software Engineer, Anyscale

Edward Oakes
Software Engineer, Anyscale