
Introducing Label Selectors: Improved Scheduling Flexibility in Ray

By The Anyscale Team and The Google GKE Team   |   October 29, 2025

Thanks to everyone who contributed to the project, including Mengjin Yan, Edward Oakes, Janet Li, Bruce Zhang, Alan Guo, Douglas Strodtman, Jui-An Huang, and Han-Ju Chen from Anyscale, and Ryan O'Leary and Andrew Sy Kim from the Google GKE team.

Today, in partnership with the Google Kubernetes Engine team, we’re introducing label selectors in Ray: a more straightforward way to place workloads on the right nodes. The API and UI ship in Ray 2.49 and are available across the Ray Dashboard, KubeRay, and Anyscale, the AI compute platform built on Ray.

With the new Label Selector API, Ray now directly helps developers accomplish things like: 

  • Assign labels to nodes in your Ray cluster (e.g. cpu-family=intel, market-type=spot, region=us-west-1)

  • When launching tasks, actors, or placement groups, declare constraints like “Run this ONLY on Intel chip nodes”

With this new API, teams get a better development experience: they can describe scheduling intent with labels rather than hacks (more details on this below), and they can more easily debug placement with UI and API support from day one.

The old workaround

When Ray developers want to schedule some of their tasks only on the spot nodes in their Ray cluster, they often fake a “spot” custom resource when starting a Ray node and then request a tiny fraction of it (0.01) per task, hoping that no more than 100 tasks ever co-locate on a node at once and exhaust the “spot” resource.

ray start --resources '{"CPU": 8, "spot": 1}'

@ray.remote(resources={"spot": 0.01})
def func():
    pass

The workaround above highlights one class of scheduling limitations: non-resource constraints (spot vs. on-demand) have to be faked as “resources.” It conflates resource quantities (CPU/GPU/RAM) with placement constraints (region/zone, market type) and forces non-resource needs to be modeled as resources.

A second and equally common limitation is exact matching only: many Ray developers need any-of or negative matches -- e.g., “not on a GPU node” or “in us-west1-a or us-west1-b” -- which prior APIs can’t express.

Label selectors address both limitations. They let you flexibly express the scheduling requirements for tasks, actors, and placement group bundles using node labels defined at RayCluster creation time or auto-detected by Ray (e.g., accelerator type).

Ray label selectors draw inspiration from Kubernetes labels and selectors, enhancing scheduling interoperability between the two systems through familiar APIs and semantics. This is one of many ongoing initiatives where Ray and Kubernetes work in concert to unlock more advanced use cases.

What you can do with labels (a few of these are sketched after the list):

  • Pin to a specific node (by node id)

  • Run on/avoid the head node or a worker group

  • CPU-only placement 

  • Target a specific accelerator or set

  • Choose node market type (spot or on-demand)

  • Keep work in/out of regions or zones

  • Select by CPU family (or other host trait)

  • Target TPU pod slices
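For example, here is a minimal sketch of a few of these patterns using the `label_selector` API described in the next section. The `ray.io/` labels are the defaults listed later in this post; the node ID and accelerator values are placeholders.

import ray

# Pin a task to a specific node by its node ID (placeholder value;
# real IDs come from ray.nodes() or the Ray Dashboard)
@ray.remote(label_selector={"ray.io/node-id": "<some-node-id>"})
def pinned():
    pass

# Target a specific accelerator type
@ray.remote(num_gpus=1, label_selector={"ray.io/accelerator-type": "L4"})
def gpu_work():
    pass

# Keep work on on-demand nodes in a specific zone
@ray.remote(label_selector={
    "ray.io/market-type": "on-demand",
    "ray.io/availability-zone": "us-west1-a",
})
def steady_work():
    pass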

See the Anyscale and Ray guides for usage instructions and full API details.

Label Selector API

Label selectors in user code

Label selectors let you express where work should run by matching against node labels. 

You can add a label selector to:

  • Tasks/actors: `label_selector` in `@ray.remote(...)`

  • Placement groups: `bundle_label_selector=[...]` at creation time

Supported semantics: match, not match, any-of, not-any-of

Going back to the examples above, here is how you’d express those requirements:

import ray
from ray.util.placement_group import placement_group

# Schedule a task on spot instances only
@ray.remote(label_selector={"ray.io/market-type": "spot"})
def func():
    pass

# Placement group: avoid the GPU worker group and run in us-west-1 or us-west-2
pg = placement_group(
    [{"CPU": 1}] * 2,
    bundle_label_selector=[
        {
            "ray.io/node-group": "!gpu-node-group",
            "ray.io/region": "in(us-west-1,us-west-2)"
        }
    ] * 2,
)
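The fourth semantic, not-any-of, negates an any-of match. Assuming it mirrors the `in(...)` string syntax above, excluding a set of regions would look like:

# Not-any-of (assumed syntax, mirroring in(...) above): keep this
# task out of two regions at once
@ray.remote(label_selector={"ray.io/region": "!in(us-west-1,us-west-2)"})
def outside_west():
    pass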

The same code runs portably across open source Ray, Kubernetes, and hosted providers like Anyscale.

Node Labels

Node labels annotate nodes with properties used for scheduling (e.g., cpu-family, accelerator-type, market-type for spot/on-demand, region/zone). You can:

Define labels when creating the cluster (Anyscale UI/SDK, KubeRay RayCluster CR, or `ray start`).

# KubeRay RayCluster CR: Label nodes in a worker group with AMD CPUs
spec:
  workerGroupSpecs:
    - replicas: 1
      labels:
        cpu-family: amd

# ray start: Label a node with AMD CPUs
ray start --labels="cpu-family=amd"

On the Anyscale UI: Label nodes in a worker group with AMD CPUs

[Image: labeling a worker group in the Anyscale UI]

Rely on auto-detected labels that Ray and Anyscale provide (e.g., accelerator type). On Anyscale, additional default labels (market type, region/zone) are populated to enable more flexible placement. See the docs to learn more about the default labels, which are listed below:

| Label Name | Description | Supported in Anyscale | Supported in OSS |
| --- | --- | --- | --- |
| ray.io/node-id | A unique ID generated for the node. | Yes | Yes |
| ray.io/accelerator-type | The accelerator type of the node, for example L4. | Yes | Yes |
| ray.io/market-type | Whether the node uses spot or on-demand instances. | Yes | No¹ |
| ray.io/node-group | The name of the node’s worker group, or head for the head node. | Yes | No¹ |
| ray.io/region | The cloud region of the node. | Yes | No¹ |
| ray.io/availability-zone | The availability zone of the node. | Yes | No¹ |

Note: 1. These labels aren’t populated automatically in open source Ray, but you can still use them if you set them in the `ray start` parameters or the top-level `labels` field in the KubeRay CR.

Autoscaling clusters

Label selectors work with both static and autoscaling clusters. 

On Anyscale, the autoscaler considers both resource shape and label selectors, scaling the appropriate worker groups so it brings up nodes with the required labels. Autoscaling support for label selectors in open source Ray will be released in Ray 2.51.

How does it work?

When you submit a task or create an actor or placement group, Ray attaches the corresponding label selector (label_selector/bundle_label_selector) to the request. Each node’s Raylet holds every node’s labels (user-defined at RayCluster creation or auto-detected, e.g., accelerator family) along with other node metadata. The Raylet’s scheduler uses both the selector and the node-label information to make the placement decision: it checks the label match and performs the normal resource fit. The request is deemed schedulable on a node only when both checks pass.
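Conceptually, the per-node feasibility check combines these two tests. The following is an illustrative Python sketch, not Ray’s actual scheduler code; `matches` implements the four selector semantics from above:

# Illustrative pseudocode for the placement decision described above
def matches(value, constraint):
    # Four semantics: match, not match, any-of "in(a,b)", not-any-of "!in(a,b)"
    if constraint.startswith("!in(") and constraint.endswith(")"):
        return value not in constraint[4:-1].split(",")
    if constraint.startswith("in(") and constraint.endswith(")"):
        return value in constraint[3:-1].split(",")
    if constraint.startswith("!"):
        return value != constraint[1:]
    return value == constraint

def is_schedulable(node_labels, node_available, request):
    # 1. Label match: every selector entry must match the node's labels
    label_ok = all(
        matches(node_labels.get(key), constraint)
        for key, constraint in request["label_selector"].items()
    )
    # 2. Resource fit: the node must have enough free resources
    resource_ok = all(
        node_available.get(res, 0) >= amount
        for res, amount in request["resources"].items()
    )
    # Schedulable only when both checks pass
    return label_ok and resource_ok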

In autoscaling clusters, the Ray global control store (GCS) aggregates the unsatisfied demand, including both the label selectors and the resource requirements, and sends it to the autoscaler. The autoscaler uses this information to scale the appropriate worker groups, bringing up nodes with the required labels.

Future work

The current label selector support is a good starting point, but we have more plans to make it better.

  1. Fallback label selectors (ordered preferences): today, each request carries a single label selector. We plan to support ordered fallbacks per task, actor, or placement group bundle so you can express ordered intent, e.g., prefer accelerator=H100, else accelerator in {A100, H100}.

  2. Library support: Extend label selector support into the Ray libraries (e.g., Ray Data, Ray Serve, Ray Train) so common scheduling patterns will be handled automatically.

  3. Improved interoperability with Kubernetes: We want to make it as easy as possible to inherit labels from Kubernetes pods and influence pod scheduling in a Kubernetes-native way. 

Try it Today

  1. Ray OSS guide

  2. Anyscale guide

  3. KubeRay end-to-end example: https://docs.ray.io/en/master/cluster/kubernetes/user-guides/label-based-scheduling.html


