Reinforcement Learning Workspace for Unitree Robots

Reverse engineered prompt

Build me a reinforcement learning workspace for Unitree robots using MuJoCo. I want to be able to train walking or velocity-tracking policies for robots like Go2, G1, A2, R1, H1_2, and H2, then replay the trained policy in simulation to check that it behaves correctly.

Also include motion imitation for G1, where I can take a CSV motion file, convert it into the right training format, train a policy to mimic it, and play it back visually. The workflow should feel simple: train, play, then prepare for real-robot deployment.

Please include clear Python scripts for training, playing, and motion conversion, plus C++ deployment code for running an exported policy on a Unitree robot or the included simulator. Save training logs and checkpoints in a predictable place, and export policy.onnx files for deployment.

Add setup and usage docs with example commands, including simulator testing before real robot use. Look up current MuJoCo, mjlab, and Unitree docs online if needed.

unitreerobotics/unitree_rl_mjlab — reverse-engineered prompt
