Reinforcement Learning — Hello World

December 5, 2025

This week’s demo showcases my first foray into modern Reinforcement Learning (RL). Teaching a Quadcopter to “learn Gravity” is one of the simplest RL problems I could imagine, which makes it an excellent starting point for an RL journey. I quickly found out, however, that I already have something like 5k to 10k hours of experience in this type of work (reward shaping, curriculum design, hyperparameter tuning, etc.). We just used a different set of tools and libraries in 2010 (RL for stable legged locomotion via nonlinear optimization and Hybrid Zero Dynamics).

Goals

Spin up a Reinforcement Learning workspace
Train a (simplified) Quadcopter to “learn Gravity”

Tech Comparisons

(this) RL + PD
PD only
PD + Full Model Compensation
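
To make the comparison concrete, here is a minimal sketch of why a PD-only controller struggles with gravity and what a compensation term buys. It is not the demo's controller: it assumes a hypothetical 1-D point mass with vertical thrust only, and the mass, gains, and time step are illustrative values. One plausible reading of the “RL + PD” variant is that the policy learns something like the feedforward term that the full model compensation supplies analytically.

```python
# Hypothetical 1-D altitude model: a point mass with vertical thrust only.
# All constants below are assumed for illustration, not taken from the demo.
MASS = 1.0          # kg
GRAVITY = 9.81      # m/s^2
KP, KD = 20.0, 8.0  # PD gains
DT = 0.002          # integration step in seconds


def simulate(feedforward, target=1.0, steps=5000):
    """Hold `target` altitude with PD plus a constant thrust term.

    Returns the altitude after `steps` integration steps.
    """
    z, vz = target, 0.0  # start at the target, at rest
    for _ in range(steps):
        thrust = KP * (target - z) - KD * vz + feedforward
        az = thrust / MASS - GRAVITY
        vz += az * DT  # semi-implicit Euler: update velocity first
        z += vz * DT   # then position, using the new velocity
    return z


pd_only = simulate(feedforward=0.0)             # PD only
pd_comp = simulate(feedforward=MASS * GRAVITY)  # PD + gravity compensation

print(f"PD only:           {pd_only:.3f} m")
print(f"PD + compensation: {pd_comp:.3f} m")
```

With these assumed gains, the PD-only run sags below the target by roughly MASS * GRAVITY / KP, because the proportional term needs a standing error to produce hover thrust. The compensated run holds the target. A learned residual added on top of the PD output would have to close that same gap.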

Notable Tech

C++ for physics and control
Python for RL environment and training
pybind11 and ONNX for Python/C++ bridges
MuJoCo simulator (C++)
6-DOF stabilization
modified version of the Skydio X2 quadcopter model from MuJoCo Menagerie: https://github.com/google-deepmind/mujoco_menagerie/tree/main/skydio_x2

© 2026 matt powell