Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Vincenzo Petrone

Scaling Population-Based Reinforcement Learning with GPU Accelerated Simulation

Apr 08, 2024
Asad Ali Shahid, Yashraj Narang, Vincenzo Petrone, Enrico Ferrentino, Ankur Handa, Dieter Fox, Marco Pavone, Loris Roveda

In recent years, deep reinforcement learning (RL) has shown its effectiveness in solving complex continuous control tasks like locomotion and dexterous manipulation. However, this comes at the cost of an enormous amount of experience required for training, exacerbated by the sensitivity of learning efficiency and the policy performance to hyperparameter selection, which often requires numerous trials of time-consuming experiments. This work introduces a Population-Based Reinforcement Learning (PBRL) approach that exploits a GPU-accelerated physics simulator to enhance the exploration capabilities of RL by concurrently training multiple policies in parallel. The PBRL framework is applied to three state-of-the-art RL algorithms -- PPO, SAC, and DDPG -- dynamically adjusting hyperparameters based on the performance of learning agents. The experiments are performed on four challenging tasks in Isaac Gym -- Anymal Terrain, Shadow Hand, Humanoid, Franka Nut Pick -- by analyzing the effect of population size and mutation mechanisms for hyperparameters. The results show that PBRL agents achieve superior performance, in terms of cumulative reward, compared to non-evolutionary baseline agents. The trained agents are finally deployed in the real world for a Franka Nut Pick task, demonstrating successful sim-to-real transfer. Code and videos of the learned policies are available on our project website.

* Submitted for publication to IEEE Robotics and Automation Letters (RA-L)

Via

Access Paper or Ask Questions

Time-Optimal Trajectory Planning with Interaction with the Environment

Jul 12, 2022
Vincenzo Petrone, Enrico Ferrentino, Pasquale Chiacchio

Figure 1 for Time-Optimal Trajectory Planning with Interaction with the Environment

Figure 2 for Time-Optimal Trajectory Planning with Interaction with the Environment

Figure 3 for Time-Optimal Trajectory Planning with Interaction with the Environment

Figure 4 for Time-Optimal Trajectory Planning with Interaction with the Environment

Optimal motion planning along prescribed paths can be solved with several techniques, but most of them do not take into account the wrenches exerted by the end-effector when in contact with the environment. When a dynamic model of the environment is not available, no consolidated methodology exists to consider the effect of the interaction. Regardless of the specific performance index to optimize, this article proposes a strategy to include external wrenches in the optimal planning algorithm, considering the task specifications. This procedure is instantiated for minimum-time trajectories and validated on a real robot performing an interaction task under admittance control. The results prove that the inclusion of end-effector wrenches affect the planned trajectory, in fact modifying the manipulator's dynamic capability.

* \copyright 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

Via

Access Paper or Ask Questions