Flow: Deep Reinforcement Learning for Control in SUMO

18 pages • Published: June 25, 2018

Abstract

We detail the motivation and design decisions underpinning Flow, a computational framework integrating SUMO with the deep reinforcement learning libraries rllab and RLlib, allowing researchers to apply deep reinforcement learning (RL) methods to traffic scenarios, and permitting vehicle and infrastructure control in highly varied traffic environments. Users of Flow can rapidly design a wide variety of traffic scenarios in SUMO, enabling the development of controllers for autonomous vehicles and intelligent infrastructure across a broad range of settings.

Flow facilitates the use of policy optimization algorithms to train controllers that can optimize for highly customizable traffic metrics, such as traffic flow or system-wide average velocity. Training reinforcement learning agents using such methods requires a massive amount of data, thus simulator reliability and scalability were major challenges in the development of Flow. A contribution of this work is a variety of practical techniques for overcoming such challenges with SUMO, including parallelizing policy rollouts, smart exception and collision handling, and leveraging subscriptions to reduce computational overhead. To demonstrate the resulting performance and reliability of Flow, we introduce the canonical single-lane ring road benchmark and briefly discuss prior work regarding that task. We then pose a more complex and challenging multi-lane setting and present a trained controller for a single vehicle that stabilizes the system. Flow is an open-source tool and available online at https://github.com/cathywu/flow.
Keyphrases: autonomous vehicles, deep reinforcement learning, learning and adaptive systems, traffic microsimulation, vehicle dynamics

In: Evamarie Wießner, Leonhard Lücken, Robert Hilbrich, Yun-Pang Flötteröd, Jakob Erdmann, Laura Bieker-Walz and Michael Behrisch (editors). SUMO 2018 - Simulating Autonomous and Intermodal Transport Systems, vol 2, pages 134-151.