Analytical MuJoCo MPC
Exact first- and second-order rigid-body dynamics derivatives for MuJoCo — replacing finite differences in the model-predictive-control loop.
Exploratory research, in progress (2026). This page summarizes work on computing exact analytical derivatives of MuJoCo’s dynamics and using them inside a whole-body MPC loop, instead of MuJoCo’s default finite differences. Measured results are labeled as such; one end-to-end MPC number is a projection and is called out explicitly.
TL;DR
MuJoCo is the de-facto simulator for robot control, but its dynamics Jacobians are computed by finite differencing — which is both slow (it re-simulates the system many times per derivative) and accuracy-limited. This project derives and implements the exact analytical rigid-body-dynamics (RBD) derivatives of MuJoCo’s own smooth dynamics and drops them into the MPC solver.
Two contributions:
- First order (FO): the missing analytical \(\partial/\partial q\) half of MuJoCo’s dynamics — the position derivatives MuJoCo computes only by finite difference today.
- Second order (SO): the exact analytical Hessian, enabling true exact-Hessian Newton steps in MPC (rather than Gauss–Newton / iLQG approximations).
Headline measured results: analytical FO matches finite-difference to 1e-8…1e-10 (3–6 orders inside the validation gate) and is up to 11.6× faster than MuJoCo’s mjd_transitionFD at humanoid scale; analytical SO is bit-identical to compiled CasADi autodiff while needing zero code generation and zero build RAM (CasADi runs out of memory at humanoid scale).
The quadruped MPC solution
The analytical derivatives drive a whole-body MPC controller on a Unitree B2 quadruped + Z1 arm (a loco-manipulation platform). The MPC plans over a 14-node horizon with a Fatrop OCP solver.
The problem
For MPC, you need the dynamics Jacobians \(\partial f/\partial x\) at every node, every tick. In MuJoCo today:
-
mjd_smooth_velgives only the velocity half of the smooth dynamics analytically — i.e. \(\partial\,qacc/\partial v\). - The position half \(\partial\,qacc/\partial q\) has no analytical path — it is recovered by central finite differences via
mjd_transitionFD, which costs \(O(n^2)\) forward simulations and is limited to ~1e-6 accuracy. - Contact/constraint derivatives are omitted entirely.
So the position Jacobian — the load-bearing term for MPC stability — is exactly the piece MuJoCo computes least accurately and most slowly.
Old vs. new derivative expressions
MuJoCo’s smooth forward dynamics (contacts disabled), with \(c \equiv qfrc\_bias\) the bias (Coriolis/centrifugal/gravity) force:
\[qacc = M(q)^{-1}\big(\tau - c(q,v)\big)\]Old (baseline) — what MuJoCo uses today
\[\underbrace{\frac{\partial\, qacc}{\partial v}}_{\text{analytic, velocity half}} \ \text{via}\ \texttt{mjd\_smooth\_vel} \qquad\qquad \underbrace{\frac{\partial\, qacc}{\partial q}}_{\text{finite difference}} \approx \frac{qacc(q+\delta e_i,\,v) - qacc(q-\delta e_i,\,v)}{2\delta}\ \ (\texttt{mjd\_transitionFD})\]New (this work) — the analytical position half
\[\frac{\partial\, qacc}{\partial q} = -M^{-1}\!\left[\frac{\partial c}{\partial q} + \frac{\partial M}{\partial q}\,qacc\right], \qquad \frac{\partial\, qacc}{\partial v} = -M^{-1}\frac{\partial c}{\partial v}\]The \(\tfrac{\partial M}{\partial q}\,qacc\) term is not negligible — dropping it is a common and silent source of error. A clean, numerically-robust equivalent reuses the inverse-dynamics map \(\mathrm{id}(q,v,a) = M(q)\,a + c(q,v)\):
\[\boxed{\ \frac{\partial\, qacc}{\partial q} = -M^{-1}\,\frac{\partial\, \mathrm{id}(q,v,a_0)}{\partial q}\bigg|_{a_0 = qacc}\ }\]which is evaluated in a single \(O(n^2)\) RNEA-style spatial recursion rather than \(O(n^2)\) full forward simulations.
Second order (the moat) — analytical Lagrangian Hessian
For a multiplier \(\lambda\), the four Hessian blocks needed by an exact-Hessian MPC:
\[\lambda^{\top}\nabla^2\tau = \lambda^{\top}\!\left[\ \frac{\partial^2\tau}{\partial q\,\partial q},\ \ \frac{\partial^2\tau}{\partial v\,\partial v},\ \ \frac{\partial^2\tau}{\partial q\,\partial v},\ \ \frac{\partial^2\tau}{\partial a\,\partial q}\ \right]\]computed as a directional quantity in one adjoint pass — \(O(n^2)\) instead of forming the full \(O(n^3)\) tensor.
Derivative accuracy
FO accuracy gate (KS0). Analytical \(\partial\,qacc/\partial\{q,v\}\) vs. MuJoCo’s own central-difference qacc. Pre-registered gate: max relative error ≤ 1e-5 on every robot. Result — passes by 3–6 orders of magnitude:
| Robot | nv | Base | ∂qacc/∂q (max-rel vs FD) | ∂qacc/∂v |
|---|---|---|---|---|
| panda | 9 | fixed | 6.7e-9 | 2.6e-8 |
| solo12 | 18 | floating | 4.9e-9 | 2.4e-7 |
| talos | 38 | floating | 6.2e-8 | 9.0e-8 |
SO accuracy. Analytical second-order vs. CasADi autodiff and finite difference on B2+Z1 (nv=25) — essentially exact (machine precision against CasADi):
| Block | vs. CasADi | vs. FD |
|---|---|---|
| ∂τ/∂q | 5.9e-16 | 4.9e-10 |
| ∂τ/∂v | 1.1e-15 | 7.4e-10 |
| M (= ∂τ/∂a) | 1.1e-15 | 8.7e-10 |
| ∂²τ/∂v∂v | 8.1e-14 | verified |
| ∂²τ/∂q∂v | 9.5e-13 | verified |
| ∂²τ/∂a∂q | 4.8e-12 | verified |
Speed
FO speed vs. finite difference (KS1). Wall-clock per derivative evaluation; the gap grows with DoF as expected (exact \(O(n^2)\) RBD vs. \(O(n)\times\) forward-sim FD):
| Robot | nv | FD mjd_transitionFD (µs) |
Analytical (µs) | Speedup |
|---|---|---|---|---|
| panda | 9 | 88.8 | 33.7 | 2.6× |
| solo12 | 18 | 238.9 | 45.0 | 5.3× |
| talos | 38 | 1163.1 | 100.0 | 11.6× |
Binding-level speed vs. compiled CasADi (same quantities, both compiled to C, B2+Z1, nv=25):
| Quantity | Analytical (µs) | CasADi compiled-C (µs) | Speedup |
|---|---|---|---|
| FO (∂τ/∂q, ∂τ/∂v, M) | 26.2 | 49.0 | 1.9× |
| SO (4 Hessian blocks) | 129.0 | 150.7 | 1.2× |
…and crucially, the one-time codegen wall: analytical needs 0 generated code, 0 compile time, 0 peak RAM, whereas CasADi SO generates 8.83 MB of C, takes 63 s to compile, and runs out of memory (>32 GB) at humanoid scale.
MPC results
Measured end-to-end (B2+Z1, 14-node Fatrop OCP, CasADi derivatives). Derivatives dominate — ~80% of every control tick is spent in the Jacobian and Hessian callbacks:
| Callback | Per tick | Share |
|---|---|---|
Lagrangian Hessian (nlp_hess_l) |
~27 ms | ~45% |
Constraint Jacobian (nlp_jac_g) |
~20 ms | ~32% |
| Fatrop linear solve | ~7 ms | ~11% |
| Constraints / gradient / misc | ~2 ms | ~3% |
| Total | ~62 ms/tick |
Projected with analytical derivatives (component-wise measurements applied to the same loop — projection, not yet a closed-loop measurement): replacing nlp_jac_g (3.7 → 0.36 ms, ~10×) and nlp_hess_l (7.8 → 1.5 ms, ~5×) brings the tick to ~20 ms (~3× faster).
Scaling / buildability. Analytical builds and runs at every scale; CasADi’s SO tensor cannot be built at humanoid scale:
| Robot | nv | Analytical FO (µs) | Analytical SO (µs) | CasADi SO build |
|---|---|---|---|---|
| panda | 7 | 4.9 | 23.5 | OK |
| hyq | 18 | 16.1 | 69.3 | OK (2.7 GB) |
| b2_z1 | 25 | 26.2 | 129.0 | OK (63 s, 1.4 GB) |
| talos | 38 | — | — | OOM (>32 GB) |
How it compares
| Method | Speed | Accuracy | Exact? | Uses MuJoCo’s own model? |
|---|---|---|---|---|
FD baseline (mjd_transitionFD) |
slow — \(O(n^2)\) sims | ~1e-6 (FD-limited) | no | yes |
| WASP (Liang & Rakita, 2025) | ~2× over FD | approximate | no | yes |
| Analytical (this work) | up to 11.6× over FD | 1e-8 – 1e-10 | yes | yes |
Models tested
Status & next steps
- ✅ FO well-posedness (KS0) — analytical matches FD to 1e-8…1e-10 across nv = 9 → 38.
- ✅ FO speed (KS1) — up to 11.6× over FD at humanoid scale.
- ✅ SO correctness — bit-identical to CasADi, validated against FD, with no codegen wall.
- ⏳ Closed-loop accuracy vs. WASP — analytical (exact) vs. WASP (approximate) inside the live MPC.
- ⏳ Exact-Hessian vs. Gauss–Newton — does the exact Hessian buy faster MPC convergence? (the flagship SO question)
Stack: MuJoCo, rigid-body-dynamics derivatives (RNEA/ABA-style recursions), CasADi (baseline), Fatrop OCP, whole-body MPC, C++/Python.
Exploratory research conducted independently. Numbers above are taken from the project’s pre-registered result logs; the end-to-end analytical MPC tick time is a projection from per-component measurements and is labeled accordingly.