A beginner’s tutorial for working with multi-agent environments, models, and algorithms.
“Hands-on RL with Ray’s RLlib” is a beginner’s tutorial for working with reinforcement learning (RL) multi-agent environments, models, and algorithms using Ray’s RLlib library. RLlib offers high scalability, a large list of algorithms to choose from (offline, model-based, model-free, etc.), support for TensorFlow and PyTorch, and a unified API for a variety of applications and customizations. This tutorial begins with a brief overview of concepts (e.g., why RL) before proceeding to RLlib models, hyperparameter tuning, debugging, student exercises, Q&A, and more. All code will be provided as .py files in a GitHub repo.
Python programmers who want to get started with reinforcement learning and RLlib
Some Python programming experience
Some familiarity with machine learning
Experience in reinforcement learning and Ray would be helpful, but isn’t required
Experience with TensorFlow or PyTorch would be helpful, but isn’t required
What is reinforcement learning and why RLlib
How to configure RLlib and tune its hyperparameters
RLlib debugging best practices
Sven has been working as a machine learning engineer for Anyscale Inc. since early 2020. He is the lead developer of "RLlib", Ray's industry-grade, scalable reinforcement learning (RL) library. His team is currently focusing on better supporting the most promising industry use cases, such as massive multi-agent algorithms for league-based self-play, recommender systems and slate recommendation algorithms such as contextual bandits, and integration with Ray's new datasets library for a better offline RL experience. A continuing effort of his is maintaining high levels of stability and test coverage to ensure RLlib's rapid adoption in industry and research, and helping to grow its community and contributor base. Before starting at Anyscale, he was a leading developer of other successful open-source RL library projects, such as "RLgraph" and "TensorForce".