Reinforcement learning sessions at Ray Summit: A guided tour

By Avnish Narayan, Christy Bergman and Jun Gong   

Reinforcement learning (RL) is gaining traction as a complementary approach to supervised learning, with applications ranging from recommender systems to games to production planning. Whether you’re an experienced RL practitioner or you’re just starting to experiment with RL, here are a few of the RL sessions at this year’s Ray Summit that we think you will be excited about!

Getting a kick out of RLlib at FIFA World Cup 2022

Organizing the FIFA World Cup is a major challenge for Qatar transport authorities: the event spans eight stadiums and will be attended by thousands of people over the course of a month, all within 20 miles of a busy city center. To control congestion and facilitate mobility at the event, the Qatar Computing Research Institute (QCRI) is working with local transport authorities to model the traffic demands and develop a traffic control policy — and RL is a key piece of the puzzle. Catch the QCRI team’s session to learn how they used RLlib to speed up multi-agent learning for coordinated traffic light control. 

Efficient game-playing bots with supervised deep learning

Gaming is a common use case for RL. Cutting-edge gaming companies like Riot Games use large-scale deep RL to build bots that play games at various skill levels, giving game designers the feedback they need to ensure the best experience for players. But how can you get the most out of training time while the bots are learning to play the game? Wesley Kerr from Riot Games will share how his team used large supervised neural networks in the offline RL process to reduce training server time for the game Teamfight Tactics. He will also share how they simplified and scaled their RL framework with RLlib, Ray Data, Ray Train, and Ray Tune.

Harnessing the power of the wind with RL

Computational fluid dynamics simulation: it can be complicated and time-consuming, but getting it right can have real-world rewards. In collaboration with Microsoft and Vestas, minds.ai developed RL-based wind farm controllers that boost annual energy production by 1-2%. The algorithm does this by adjusting the yaw of upstream wind turbines to minimize wake losses at downstream turbines. Don’t miss the minds.ai team’s talk, where they’ll share how they used their end-to-end machine learning platform DeepSim, built on Ray, to train the wind farm controllers in the cloud using up to 15,000 CPU cores in parallel.

Understanding the “why” with counterfactual explanation in real time

If you apply for a loan and don’t get it, you’ll probably want to know why, and what steps you can take to get the loan next time. What you need in this case is a counterfactual explanation: If X had not occurred, Y would not have occurred. Counterfactuals are explanations that produce actionable steps to move a data point from one side of a decision boundary to another — and as it turns out, it’s another problem that RL can help to solve. Karthik Rao and Sahil Verma will share how they used OpenAI Gym and RLlib to develop an RL-based, real-time counterfactual explainer called FastCFE. 
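
To make the idea of "actionable steps across a decision boundary" concrete, here is a minimal sketch in Python. It is not FastCFE and it does not use RL: it swaps in a plain greedy search over a toy linear loan model. The weights, feature names, and allowed actions are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Applicant = Dict[str, float]
Action = Tuple[str, float]

# Hypothetical linear loan model: approve when the score is non-negative.
WEIGHTS: Applicant = {"income": 0.5, "debt": -1.0, "years_employed": 2.0}
BIAS = -30.0

# Changes the applicant could act on, as (feature, change per step).
# years_employed is deliberately absent: it is not quickly actionable.
ACTIONS: List[Action] = [("income", 5.0), ("debt", -5.0)]


def score(applicant: Applicant) -> float:
    """Linear score of the toy model; the decision boundary is score == 0."""
    return BIAS + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)


def approved(applicant: Applicant) -> bool:
    return score(applicant) >= 0.0


def apply_action(applicant: Applicant, action: Action) -> Applicant:
    """Return a copy of the applicant with one action applied."""
    name, delta = action
    changed = dict(applicant)
    changed[name] += delta
    return changed


def greedy_counterfactual(
    applicant: Applicant, max_steps: int = 20
) -> Tuple[Applicant, List[Action]]:
    """Greedily pick the valid action that raises the score most, step by step.

    Stops once the applicant is approved, no valid action remains, or
    max_steps is reached (so the result may still be a rejection).
    """
    current = dict(applicant)
    steps: List[Action] = []
    for _ in range(max_steps):
        if approved(current):
            break
        # Keep only actions that leave every feature non-negative.
        valid = []
        for action in ACTIONS:
            candidate = apply_action(current, action)
            if min(candidate.values()) >= 0.0:
                valid.append((score(candidate), action, candidate))
        if not valid:
            break
        _, best_action, current = max(valid, key=lambda item: item[0])
        steps.append(best_action)
    return current, steps


if __name__ == "__main__":
    rejected = {"income": 40.0, "debt": 10.0, "years_employed": 2.0}
    counterfactual, steps = greedy_counterfactual(rejected)
    print("approved after changes:", approved(counterfactual))
    print("suggested steps:", steps)
```

The returned list of steps is the explanation: a sequence of changes that would have flipped the decision. A search like this has to be rerun for every applicant, whereas an RL-based explainer aims to learn a policy that produces such steps directly.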

Faster MILP production scheduling at Dow

Production scheduling is integral to Dow’s supply chain management, helping enable better and faster decision making that positively impacts customers, financial performance, and shareholder value. The traditional approach, mixed integer linear programming (MILP) optimization, is computationally expensive: MILP is NP-hard in general, so solution time can grow exponentially with problem size. Adam Kelloway from Dow will present AlphaDow, a project that uses multi-agent decomposition in RLlib, splitting the scheduling problem between two hierarchical agents, to significantly reduce the solution time for MILP problems.

The latest in RLlib, straight from the RLlib team

Finally, don’t miss the sessions from the RLlib team.

What’s new in RLlib?

In this 30-minute talk, Jun Gong from the RLlib team will go over some of RLlib 2.0’s biggest improvements, including:

  • A new way of configuring supported algorithms

  • Revamped core algorithm logic using plain Python implementations

  • Restructured algorithm and policy implementations that are easier to extend and customize

  • A feature that consolidates all user environment interactions to make restoring and serving RLlib policies simple and more robust

  • Most importantly, the integration between RLlib and the Ray AIR* ecosystem

Our goal for all this work is to make RLlib intuitive, easy to extend, and performant for various research and production use cases.

Production RL and decision-making with RLlib

RLlib is already used in production by industry leaders in different verticals, such as climate control, manufacturing and logistics, finance, gaming, automotive, robotics, boat design, and many others. Avnish Narayan from the RLlib team will walk through case studies of real industrial applications built on top of popular RLlib algorithms.

Bonus: If you’re new to RL or RLlib, consider attending the RLlib training session, led by Sven Mika. Training will take place on August 22, before the main conference begins. Seats are limited, so register soon!

See you at the summit!

That’s just the beginning of what we have in store at this year’s Ray Summit. Over the two days of the conference, we’ll learn how companies like ByteDance, Shopify, Uber, IBM, and more are building cutting-edge ML platforms and applications on Ray. We’ll hear keynotes from AI and ML luminaries like Greg Brockman, co-founder of OpenAI, and Soumith Chintala, creator of PyTorch. And, since it’s the first time many community members are getting together in person, we’ll also have some fun during our community happy hour and Ray Summit meetup.

Register now — early bird registration ends on June 30!

*Update Sep 16, 2023: We are sunsetting the "Ray AIR" concept and namespace starting with Ray 2.7. The changes follow the proposal outlined in this REP.
