Despite tremendous success in many fields, machine learning remains notoriously difficult to apply on robotic hardware, especially for dynamically unstable systems such as walking and running robots. Among the challenges are the need to avoid hardware-breaking failures, the sparseness of rewards, and the difficulty of generating training data.
Instead of focusing on the algorithms, we take a different approach: how can the design of the natural dynamics of the robot itself, i.e., the dynamics of the mechanical system and its low-level control, make learning easier?
We test ideas such as training wheels, i.e., temporary modifications of the mechanical system, and show how such modifications reduce the manual tuning needed for direct policy learning.
We also formalize these concepts mathematically, show the connection between reliable learning and robustness in control, and show how robustness can be quantified for a system before high-level control objectives or policies are specified.