Debugging instability

Discussion in 'Priority support' started by Ian Danforth, Jan 18, 2019.

  1. I am developing a version of the OpenAI ant model that uses tendons (I will move to muscles in the near future). I am running MuJoCo 1.5 for the time being.

    During training runs that use this model I have encountered each of the following:

    MuJoCo Warning: Nan, Inf or huge value in QPOS at DOF 0. The simulation is unstable. Time = 5.8700
    MuJoCo Warning: Nan, Inf or huge value in QVEL at DOF 0. The simulation is unstable. Time = 31.5900
    MuJoCo Warning: Nan, Inf or huge value in QACC at DOF 0

    I have been unable to reliably reproduce this crash, which makes it difficult to debug. I have tried recording and replaying the ctrl inputs to the model just prior to the crash, but this didn't reproduce the error.

    The model already uses the RK4 integrator, and I've seen the crash with timesteps down to 0.002 (though I usually run with a larger timestep).

    I would appreciate help with two things:

    1. What exactly does DOF 0 refer to? How is the DOF numbering mapped to the model?

    2. What is the best way to isolate the immediate cause of the above errors?

    While I could guess and check with various parameters and hope it goes away, I'd like to understand where the instability arises so that I can prevent future occurrences.

    Thank you for any time you spend on this.
     
  2. Emo Todorov

    Administrator, Staff Member

    DOF 0 is the first DOF of the first joint defined in the model. If the model has a root joint, that would be the X-translation of the root. Note that the software only complains about the first DOF with suspicious values. It is possible that the other DOFs have the same problem.
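
    The numbering described above can be sketched in a few lines. This is only an illustration, not code from MuJoCo: the joint list is a hypothetical ant-like layout, and the helper name `dof_table` is made up. It assumes the usual per-joint DOF counts (free = 6, ball = 3, slide = 1, hinge = 1) and that DOFs are numbered in the order the joints are defined.

```python
# DOFs contributed by each MuJoCo joint type (entries in qvel/qacc).
# A free joint has 3 translational + 3 rotational DOFs; a ball joint has 3.
DOFS_PER_JOINT = {"free": 6, "ball": 3, "slide": 1, "hinge": 1}

FREE_LABELS = ["x-translation", "y-translation", "z-translation",
               "x-rotation", "y-rotation", "z-rotation"]


def dof_table(joints):
    """Map each DOF index to (joint name, component), in definition order.

    `joints` is a list of (name, type) pairs in the order the joints
    appear in the model file.
    """
    table = []
    for name, jtype in joints:
        for k in range(DOFS_PER_JOINT[jtype]):
            if jtype == "free":
                label = FREE_LABELS[k]
            elif jtype == "ball":
                label = "rotation component %d" % k
            else:
                label = jtype
            table.append((name, label))
    return table


# Hypothetical ant-like layout: a free root joint followed by hinge joints.
ant_joints = [("root", "free"), ("hip_1", "hinge"), ("ankle_1", "hinge")]
for dof, (name, label) in enumerate(dof_table(ant_joints)):
    print(dof, name, label)
```

    With a free root joint, DOF 0 comes out as the root's x-translation and the first hinge lands at DOF 6, so a warning about DOF 0 on such a model concerns the root body and not a leg joint.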

    RK4 is not always more stable than Euler. If there is damping in the joints, Euler with a smaller time step is better.
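
    A toy one-DOF example may help show why damping matters here. It is a sketch under stated assumptions, not MuJoCo's integrator code: it assumes the Euler scheme evaluates the damping force at the new velocity (implicit damping), while RK4 treats it explicitly. The values of `h`, `b`, and `m` are made up to represent a very light, damped DOF.

```python
def rk4_step(v, h, b, m):
    """One explicit RK4 step of m * dv/dt = -b * v."""
    def f(u):
        return -b * u / m
    k1 = f(v)
    k2 = f(v + 0.5 * h * k1)
    k3 = f(v + 0.5 * h * k2)
    k4 = f(v + h * k3)
    return v + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0


def implicit_damping_step(v, h, b, m):
    """One Euler step with the damping force evaluated at the new velocity."""
    # m * (v_new - v) / h = -b * v_new  =>  v_new = v / (1 + h * b / m)
    return v / (1.0 + h * b / m)


def simulate(step, v0, h, b, m, n):
    """Apply `step` n times starting from velocity v0."""
    v = v0
    for _ in range(n):
        v = step(v, h, b, m)
    return v


# Illustrative numbers only: a very light DOF with noticeable damping.
h, b, m = 0.002, 1.0, 1e-4
v_rk4 = simulate(rk4_step, 1.0, h, b, m, 10)
v_imp = simulate(implicit_damping_step, 1.0, h, b, m, 10)
print("RK4:", v_rk4)
print("implicit damping:", v_imp)
```

    Here `h * b / m` is 20, which is outside the range where explicit RK4 is stable for this equation, so the RK4 velocity grows each step while the implicit update decays. Increasing `m` (which is roughly what a larger armature does) or reducing `h` brings the RK4 run back into its stable range.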

    It is difficult to isolate the "immediate cause" of instability, because everything is interacting with everything else. Try reducing the time step, increasing armature inertias, and avoiding very small masses and inertias in the model.
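
    Since the warning names only the first offending DOF, one way to get more information is to log qpos, qvel, or qacc every step and scan the log for all suspicious entries. The sketch below works on plain Python lists, so it does not depend on any particular MuJoCo binding. The helper names, the sample log, and the `HUGE` threshold are assumptions chosen for illustration.

```python
import math

# Assumed threshold for a "huge" value; adjust to taste.
HUGE = 1e10


def suspicious_dofs(values, limit=HUGE):
    """Return indices of entries that are NaN, infinite, or above limit."""
    return [i for i, x in enumerate(values)
            if not math.isfinite(x) or abs(x) > limit]


def first_bad_step(history, limit=HUGE):
    """Scan per-step vectors; return (step, bad DOF indices) or None."""
    for step, values in enumerate(history):
        bad = suspicious_dofs(values, limit)
        if bad:
            return step, bad
    return None


# Made-up qacc log for three DOFs over four steps.
qacc_log = [
    [0.1, -0.3, 2.0],
    [0.2, -0.9, 40.0],
    [0.5, -7.0, 3.0e12],
    [float("nan"), float("inf"), float("nan")],
]
print(first_bad_step(qacc_log))
```

    In this made-up log, DOF 2 blows up one step before everything turns into NaN, which is the kind of lead that a single "DOF 0" warning would hide.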

    Also, does this happen in contact behaviors or even without contact? You can disable contacts and gravity and see if the model is stable when you perturb it (and make it fly around).
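
    One way to run that test without hand-editing the model is to patch the XML before loading it. The sketch below uses only Python's standard library and assumes the MJCF `<option><flag contact="disable" gravity="disable"/></option>` syntax. The function name and the tiny stand-in model are hypothetical, not the real ant XML.

```python
import xml.etree.ElementTree as ET


def disable_contact_and_gravity(mjcf_text):
    """Return MJCF text with contact and gravity disabled via option/flag."""
    root = ET.fromstring(mjcf_text)
    option = root.find("option")
    if option is None:
        option = ET.SubElement(root, "option")
    flag = option.find("flag")
    if flag is None:
        flag = ET.SubElement(option, "flag")
    flag.set("contact", "disable")
    flag.set("gravity", "disable")
    return ET.tostring(root, encoding="unicode")


# Tiny stand-in model, not the real ant XML.
model_xml = """
<mujoco model="toy">
  <option timestep="0.002" integrator="RK4"/>
  <worldbody>
    <body name="torso">
      <joint name="root" type="free"/>
      <geom type="sphere" size="0.25"/>
    </body>
  </worldbody>
</mujoco>
"""
patched = disable_contact_and_gravity(model_xml)
print(patched)
```

    The patched text can then be loaded in place of the original file. If the model still diverges under perturbation with both flags disabled, contacts are probably not the source of the instability.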