Analytical MuJoCo MPC

Exact first- and second-order rigid-body dynamics derivatives for MuJoCo — replacing finite differences in the model-predictive-control loop.

Exploratory research, in progress (2026). This page summarizes work on computing exact analytical derivatives of MuJoCo’s dynamics — now including its soft contact dynamics — and using them inside a model-predictive-control loop, instead of MuJoCo’s default finite differences. Measured results are labeled as such; the one remaining projection is called out explicitly.

📄 Technical draft (PDF)Exact Analytical First-Order Derivatives of MuJoCo's Smooth and Contact Dynamics for Model-Based Control.

TL;DR

MuJoCo is the de-facto simulator for robot control, but its dynamics Jacobians are computed by finite differencing — which is both slow (it re-simulates the system many times per derivative) and accuracy-limited. This project derives and implements the exact analytical rigid-body-dynamics (RBD) derivatives of MuJoCo’s own smooth dynamics and drops them into the MPC solver.

Three contributions:

  • First order (FO), smooth dynamics: the missing analytical \(\partial/\partial q\) half of MuJoCo’s smooth dynamics — the position derivatives MuJoCo computes only by finite difference today.
  • Contact dynamics: the first exact analytical derivative through MuJoCo’s soft, elliptic-friction-cone contact solve — covering stuck, slipping, and separated contacts with a single factorization, plus a closed-form contact-Jacobian derivative for the floating base and sphere/capsule feet.
  • Second order (SO): the exact analytical Hessian, enabling true exact-Hessian Newton steps in MPC (rather than Gauss–Newton / iLQG approximations).

Headline measured results: analytical FO matches finite-difference to 1e-8…1e-10 and is up to 11.6× faster than MuJoCo’s mjd_transitionFD at humanoid scale; the contact derivative matches a clean finite difference to machine precision, and a fully analytical closed-loop quadruped MPC (state, control, and cost Jacobians \(A,B,C,D\) — zero finite differences, counter-verified) computes its derivatives 10.6× faster than FD / 8.8× faster than WASP and reaches a 37× lower control cost — the latter isolated to the analytical cost-Jacobians, which fix a real finite-difference corruption of MJPC’s time-scheduled cost gradient; analytical SO is bit-identical to compiled CasADi autodiff while needing zero code generation and zero build RAM (CasADi runs out of memory at humanoid scale).


The quadruped MPC solution

The analytical derivatives drive a whole-body MPC controller on a Unitree B2 quadruped + Z1 arm (a loco-manipulation platform). The MPC plans over a 14-node horizon with a Fatrop OCP solver.

B2 quadruped + Z1 arm tracking a reference via whole-body MPC
Reference tracking — the B2+Z1 follows a commanded whole-body trajectory under the MPC.
B2 quadruped + Z1 arm performing a pulling task via whole-body MPC
Pulling / manipulation — the same MPC executing a contact-rich pull with the arm.



The problem

For MPC, you need the dynamics Jacobians \(\partial f/\partial x\) at every node, every tick. In MuJoCo today:

  • mjd_smooth_vel gives only the velocity half of the smooth dynamics analytically — i.e. \(\partial\,qacc/\partial v\).
  • The position half \(\partial\,qacc/\partial q\) has no analytical path — it is recovered by central finite differences via mjd_transitionFD, which costs \(O(n^2)\) forward simulations and is limited to ~1e-6 accuracy.
  • Contact/constraint derivatives are omitted entirely.

So the position Jacobian — the load-bearing term for MPC stability — is exactly the piece MuJoCo computes least accurately and most slowly.


Old vs. new derivative expressions

MuJoCo’s smooth forward dynamics (contacts disabled), with \(c \equiv qfrc\_bias\) the bias (Coriolis/centrifugal/gravity) force:

\[qacc = M(q)^{-1}\big(\tau - c(q,v)\big)\]

Old (baseline) — what MuJoCo uses today

\[\underbrace{\frac{\partial\, qacc}{\partial v}}_{\text{analytic, velocity half}} \ \text{via}\ \texttt{mjd\_smooth\_vel} \qquad\qquad \underbrace{\frac{\partial\, qacc}{\partial q}}_{\text{finite difference}} \approx \frac{qacc(q+\delta e_i,\,v) - qacc(q-\delta e_i,\,v)}{2\delta}\ \ (\texttt{mjd\_transitionFD})\]

New (this work) — the analytical position half

\[\frac{\partial\, qacc}{\partial q} = -M^{-1}\!\left[\frac{\partial c}{\partial q} + \frac{\partial M}{\partial q}\,qacc\right], \qquad \frac{\partial\, qacc}{\partial v} = -M^{-1}\frac{\partial c}{\partial v}\]

The \(\tfrac{\partial M}{\partial q}\,qacc\) term is not negligible — dropping it is a common and silent source of error. A clean, numerically-robust equivalent reuses the inverse-dynamics map \(\mathrm{id}(q,v,a) = M(q)\,a + c(q,v)\):

\[\boxed{\ \frac{\partial\, qacc}{\partial q} = -M^{-1}\,\frac{\partial\, \mathrm{id}(q,v,a_0)}{\partial q}\bigg|_{a_0 = qacc}\ }\]

which is evaluated in a single \(O(n^2)\) RNEA-style spatial recursion rather than \(O(n^2)\) full forward simulations.

Second order (the moat) — analytical Lagrangian Hessian

For a multiplier \(\lambda\), the four Hessian blocks needed by an exact-Hessian MPC:

\[\lambda^{\top}\nabla^2\tau = \lambda^{\top}\!\left[\ \frac{\partial^2\tau}{\partial q\,\partial q},\ \ \frac{\partial^2\tau}{\partial v\,\partial v},\ \ \frac{\partial^2\tau}{\partial q\,\partial v},\ \ \frac{\partial^2\tau}{\partial a\,\partial q}\ \right]\]

computed as a directional quantity in one adjoint pass — \(O(n^2)\) instead of forming the full \(O(n^3)\) tensor.


Analytical contact derivatives — the soft-cone IFT

Smooth dynamics is only half of a contact-rich step. MuJoCo resolves contacts with a soft, convex, acceleration-level solve, \(qacc = qacc_{s} + M^{-1} J^{\top} f\), where \(qacc_{s}=M^{-1}\,qfrc\_smooth\), \(J=\texttt{efc\_J}\) is the constraint Jacobian, and the contact force \(f\) solves a regularized elliptic friction cone. MuJoCo ships no analytical derivative of this step — only finite difference.

One formula for every contact state. Differentiating the solve’s optimality condition by the implicit function theorem gives, for any state component \(\theta\),

\[\big(M + J^{\top} H J\big)\,\frac{\partial\, qacc}{\partial \theta} = M\frac{\partial\, qacc_{s}}{\partial \theta} - \frac{\partial M}{\partial \theta} Wf + \frac{\partial J^{\top}}{\partial \theta} f - J^{\top} H\Big(\frac{\partial J}{\partial \theta} qacc - \frac{\partial\, aref}{\partial \theta}\Big) + J^{\top}\!\big(f \odot g \odot \tfrac{\partial\, pos}{\partial \theta}\big).\]

The system matrix \(M + J^{\top} H J\) uses the per-contact cost Hessian \(H\): the diagonal stiffness when a contact is stuck, MuJoCo’s dense elliptic cone-Hessian when slipping, and zero when separated. A single Cholesky factorization therefore spans all zones — no special-casing. The last term is the regularization derivative (MuJoCo’s soft stiffness depends on penetration); omitting it is a silent source of error.

Closed-form \(\partial\,\texttt{efc\_J}/\partial q\). The contact Jacobian’s configuration derivative is the material-point kinematic Hessian plus geometry. Three pieces are MuJoCo-specific and were the technical crux: the floating base’s three rotation dofs mutually drag one another’s motion axes; a sphere/capsule on a plane reduces to its center / lower endpoint (with a ½ penetration term on the normal); and a capsule’s friction frame tracks its axis (an extra \(\partial(\text{frame})/\partial q\) on the tangent rows). Validated against finite difference of efc_J to ~1e-11.

Fully analytical MPC (Unitree A1, MJPC iLQG). The backend now computes all four transition blocks analytically — the state Jacobian \(A\), the control Jacobian \(B\), and the cost/sensor Jacobians \(C=\partial\text{residual}/\partial x\), \(D=\partial\text{residual}/\partial u\) — across 100% of a trot (sphere feet, capsule shanks, slipping contacts, joint limits). An always-on counter confirms literally zero finite-difference evaluations enter the result (no mjd_transitionFD, no mj_differentiatePos, no central differences). Same task, same settings (Quadruped Flat, 250 planning steps, single thread):

Derivative backend model-deriv (s) wall-clock (s) cost/step
FD (mjd_transitionFD) 127.1 186.5 0.3117
WASP (approximate) 106.0 167.7 0.3107
Analytical (this work) 12.0 48.3 0.0085

Two results, both measured:

  • Speed. The analytical derivative phase is 10.6× faster than FD and 8.8× faster than WASP (3.9× / 3.5× end-to-end). The gap jumped from the earlier 1.7× because the analytical \(C,D\) remove all \(2n_v\) state-rollouts — FD still re-simulates the system to get the cost Jacobian; the analytical backend does zero forward rollouts.
  • Accuracy → control quality. The analytical run reaches a 37× lower control cost (\(0.0085\) vs. \(0.31\)). This is not a confound: flipping only \(C,D\) from analytical to FD — keeping \(A,B\) analytical — reproduces the FD cost exactly (\(0.3106\)). The reason is that mjd_transitionFD computes \(C,D\) by differencing through a full mj_step, which re-fires MJPC’s time-driven gait-schedule residual and injects \(\sim\!1/\epsilon\) discontinuity noise into the cost gradient (spurious \(\mathcal{O}(10^4)\) entries on the gait rows). The analytical \(C,D\) evaluate the instantaneous residual Jacobian directly and avoid it. So the analytical derivatives don’t merely match FD — for time-scheduled costs they fix a real defect in FD-based MPC.

Per-block accuracy vs. a clean central-difference reference (machine precision): \(A\!=\!6\!\times\!10^{-10}\), \(B\!=\!7\!\times\!10^{-12}\), \(C\!=\!2\!\times\!10^{-10}\), \(D\!=\!9\!\times\!10^{-12}\). The capsule friction-frame singularity (axis \(\parallel\) normal) is handled by a gauge-invariance argument — the tangent-frame derivative cancels in \(\partial\,qacc/\partial q\) — not a finite-difference fallback. Methodology: the cost finding above is isolated (same \(A,B\), only \(C,D\) swapped); the end-to-end walltimes are each backend planning its own trajectory (the honest realistic number), while accuracy and the cost isolation are measured at identical states.


Derivative accuracy

FO accuracy gate (KS0). Analytical \(\partial\,qacc/\partial\{q,v\}\) vs. MuJoCo’s own central-difference qacc. Pre-registered gate: max relative error ≤ 1e-5 on every robot. Result — passes by 3–6 orders of magnitude:

Robot nv Base ∂qacc/∂q (max-rel vs FD) ∂qacc/∂v
panda 9 fixed 6.7e-9 2.6e-8
solo12 18 floating 4.9e-9 2.4e-7
talos 38 floating 6.2e-8 9.0e-8

SO accuracy. Analytical second-order vs. CasADi autodiff and finite difference on B2+Z1 (nv=25) — essentially exact (machine precision against CasADi):

Block vs. CasADi vs. FD
∂τ/∂q 5.9e-16 4.9e-10
∂τ/∂v 1.1e-15 7.4e-10
M (= ∂τ/∂a) 1.1e-15 8.7e-10
∂²τ/∂v∂v 8.1e-14 verified
∂²τ/∂q∂v 9.5e-13 verified
∂²τ/∂a∂q 4.8e-12 verified

Speed

FO speed vs. finite difference (KS1). Wall-clock per derivative evaluation; the gap grows with DoF as expected (exact \(O(n^2)\) RBD vs. \(O(n)\times\) forward-sim FD):

Robot nv FD mjd_transitionFD (µs) Analytical (µs) Speedup
panda 9 88.8 33.7 2.6×
solo12 18 238.9 45.0 5.3×
talos 38 1163.1 100.0 11.6×

Binding-level speed vs. compiled CasADi (same quantities, both compiled to C, B2+Z1, nv=25):

Quantity Analytical (µs) CasADi compiled-C (µs) Speedup
FO (∂τ/∂q, ∂τ/∂v, M) 26.2 49.0 1.9×
SO (4 Hessian blocks) 129.0 150.7 1.2×

…and crucially, the one-time codegen wall: analytical needs 0 generated code, 0 compile time, 0 peak RAM, whereas CasADi SO generates 8.83 MB of C, takes 63 s to compile, and runs out of memory (>32 GB) at humanoid scale.


MPC results

Measured end-to-end (B2+Z1, 14-node Fatrop OCP, CasADi derivatives). Derivatives dominate — ~80% of every control tick is spent in the Jacobian and Hessian callbacks:

Callback Per tick Share
Lagrangian Hessian (nlp_hess_l) ~27 ms ~45%
Constraint Jacobian (nlp_jac_g) ~20 ms ~32%
Fatrop linear solve ~7 ms ~11%
Constraints / gradient / misc ~2 ms ~3%
Total ~62 ms/tick  

Projected with analytical derivatives (component-wise measurements applied to the same loop — projection, not yet a closed-loop measurement): replacing nlp_jac_g (3.7 → 0.36 ms, ~10×) and nlp_hess_l (7.8 → 1.5 ms, ~5×) brings the tick to ~20 ms (~3× faster).

Scaling / buildability. Analytical builds and runs at every scale; CasADi’s SO tensor cannot be built at humanoid scale:

Robot nv Analytical FO (µs) Analytical SO (µs) CasADi SO build
panda 7 4.9 23.5 OK
hyq 18 16.1 69.3 OK (2.7 GB)
b2_z1 25 26.2 129.0 OK (63 s, 1.4 GB)
talos 38 OOM (>32 GB)

How it compares

Method Deriv. speed (quad. MPC) Control cost Accuracy Exact? Contacts?
FD baseline (mjd_transitionFD) baseline (slowest) 0.31 ~1e-6 (FD-limited) no yes (via FD)
WASP (Liang & Rakita, 2025) 1.2× faster than FD 0.31 approximate no yes (via FD)
Analytical (this work) 10.6× faster than FD, 8.8× faster than WASP 0.0085 (37× lower) machine precision yes yes (analytical)

(End-to-end on the Unitree A1 trot, 250 planning steps. The 37× lower cost is isolated to the analytical cost-Jacobians \(C,D\) — see above. End-to-end walltime is 3.9× / 3.5× faster; at humanoid scale the smooth derivative alone reaches 11.6× over FD.)


Models tested

Robot models used in the derivative and MPC benchmarks
Robots spanning fixed-base (panda) to floating-base humanoid (talos) and the B2+Z1 loco-manipulation platform.


Status & next steps

  • FO well-posedness (KS0) — analytical matches FD to 1e-8…1e-10 across nv = 9 → 38.
  • FO speed (KS1) — up to 11.6× over FD at humanoid scale.
  • Contact derivative — exact analytical derivative through MuJoCo’s soft elliptic cone (stuck/slip/separated), with closed-form ∂efc_J/∂q for floating-base sphere/capsule feet; matches a clean FD to machine precision.
  • Fully analytical closed-loop (A, B, C, D) — wired into MJPC, 100% analytic coverage of a quadruped trot with counter-verified zero finite differences; 10.6× faster derivatives than FD (8.8× than WASP) and 37× lower control cost (isolated to the analytical cost-Jacobians, which fix FD’s corruption of the time-scheduled cost gradient).
  • SO correctness — bit-identical to CasADi, validated against FD, with no codegen wall.
  • Capsule side / self-collision & other geom pairs — current contact closed form covers sphere/capsule-endpoint on a plane; cylinder/box and capsule–capsule contacts fall back to FD.
  • Exact-Hessian vs. Gauss–Newton — does the exact Hessian buy faster MPC convergence? (the flagship SO question)

Stack: MuJoCo, rigid-body-dynamics derivatives (RNEA/ABA-style recursions), CasADi (baseline), Fatrop OCP, whole-body MPC, C++/Python.


Exploratory research conducted independently. Numbers above are taken from the project’s pre-registered result logs; the end-to-end analytical MPC tick time is a projection from per-component measurements and is labeled accordingly.