Reinforcement Learning — Hello World
December 5, 2025
This week’s demo showcases my first foray into modern Reinforcement Learning (RL). Teaching a Quadcopter to “learn Gravity” is one of the simplest RL problems I could imagine, which makes it an excellent starting point for an RL journey. I quickly found out, however, that I already have something like 5k to 10k hours of experience in this type of work (reward shaping, curriculum design, hyperparameter tuning, etc.). We just used a different set of tools and libraries in 2010 (RL for stable legged locomotion via nonlinear optimization and Hybrid Zero Dynamics).
Goals
Spin up a Reinforcement Learning workspace
Train a (simplified) Quadcopter to “learn Gravity”
Tech Comparisons
RL + PD (this demo)
PD only
PD + Full Model Compensation
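To make the comparison concrete, here is a minimal sketch of why gravity is the thing to "learn". It is not the demo's code: it assumes a hypothetical 1-D point-mass altitude model with made-up mass, gains, and time step, and it stands in for both the model-compensation term and the learned RL term with a single constant `feedforward` thrust. With PD only, the controller needs a standing position error to generate the thrust that cancels gravity; with a feedforward equal to the weight, that error disappears.

```python
# Hypothetical 1-D altitude model. Every constant here is an illustrative
# assumption, not a value taken from the demo.
MASS = 1.0   # kg
G = 9.81     # m/s^2
KP = 20.0    # proportional gain, N/m
KD = 8.0     # derivative gain, N*s/m
DT = 0.001   # integration step, s


def simulate(feedforward, target=1.0, steps=10_000):
    """Hold altitude with PD thrust plus a constant feedforward thrust.

    Starts at rest on the target and returns the final position error
    (target - z) after `steps` semi-implicit Euler steps.
    Thrust is not clamped, which a real quadcopter model would require.
    """
    z, vz = target, 0.0
    for _ in range(steps):
        thrust = KP * (target - z) - KD * vz + feedforward
        accel = thrust / MASS - G
        vz += accel * DT
        z += vz * DT
    return target - z


# PD only: the vehicle sags until KP * error balances the weight,
# so the steady-state error is MASS * G / KP.
pd_only_error = simulate(feedforward=0.0)

# PD + gravity compensation: the feedforward cancels the weight exactly.
# In the RL + PD setup, a learned policy would have to discover this offset.
compensated_error = simulate(feedforward=MASS * G)

print(f"PD only error:        {pd_only_error:.4f} m")
print(f"PD + gravity FF error: {compensated_error:.4f} m")
```

In this toy setting, "learning gravity" amounts to finding the constant thrust offset that PD alone cannot supply without a standing error. The full 6DOF problem adds attitude and coupling effects that this sketch ignores.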
Notable Tech
C++ for physics and control
Python for RL environment and training
pybind11 and ONNX for Python/C++ bridges
MuJoCo simulator (C++)
6DOF stabilization
modified version of the Skydio X2 quadcopter model from MuJoCo Menagerie: https://github.com/google-deepmind/mujoco_menagerie/tree/main/skydio_x2