A Deep Emulator for Secondary Motion of 3D Characters
CVPR 2021, oral presentation

Mianlun Zheng¹, Yi Zhou², Duygu Ceylan², Jernej Barbič¹

¹University of Southern California, ²Adobe Research

Abstract

Fast and light-weight methods for animating 3D characters are desirable in various applications such as computer games. We present a learning-based approach to enhance skinning-based animations of 3D characters with vivid secondary motion effects. We design a neural network that encodes each local patch of a character simulation mesh where the edges implicitly encode the internal forces between the neighboring vertices. The network emulates the ordinary differential equations of the character dynamics, predicting new vertex positions from the current accelerations, velocities and positions. Being a local method, our network is independent of the mesh topology and generalizes to arbitrarily shaped 3D character meshes at test time. We further represent per-vertex constraints and material properties such as stiffness, enabling us to easily adjust the dynamics in different parts of the mesh. We evaluate our method on various character meshes and complex motion sequences. Our method can be over 30 times more efficient than ground-truth physically based simulation, and outperforms alternative solutions that provide fast approximations.

1. Training data examples
2. Our Secondary Motion inference pipeline
3. Our results
4. Ablation study and comparisons
5. Limitations
6. Easter egg

1. Training data examples

We construct our training data by assigning randomized motions to a primitive, e.g., a sphere. All results in the later sections are predicted by the network trained on this dataset. Here we provide three random motion examples from the sphere dataset. The color coding shows the dynamics.

Sphere (vtx#: 1015; motion: random; frames: 456):

2. Our Secondary Motion inference pipeline

Below is the basic pipeline of our method. Given the skinned animation mesh, we first build a uniform volumetric mesh and then set the constraints. At inference time, our network predicts the secondary motion with respect to the input skinned volumetric mesh. Finally, we render the surface mesh interpolated from the predicted volumetric mesh.
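The last step of the pipeline, interpolating the surface mesh from the predicted volumetric mesh, can be sketched as follows. This is a minimal illustration that assumes the volumetric mesh is tetrahedral and that each surface vertex is embedded in one tetrahedron with fixed barycentric weights; the function names are hypothetical and are not taken from our implementation.

```python
def sub(a, b):
    """Component-wise difference of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def det3(a, b, c):
    """Scalar triple product a . (b x c), i.e. the 3x3 determinant."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def barycentric_weights(p, tet):
    """Weights of point p with respect to the four rest-pose tet vertices."""
    v0, v1, v2, v3 = tet
    e1, e2, e3 = sub(v1, v0), sub(v2, v0), sub(v3, v0)
    d = sub(p, v0)
    vol = det3(e1, e2, e3)  # 6x the signed tet volume; must be non-zero
    w1 = det3(d, e2, e3) / vol  # Cramer's rule for d = w1*e1 + w2*e2 + w3*e3
    w2 = det3(e1, d, e3) / vol
    w3 = det3(e1, e2, d) / vol
    return (1.0 - w1 - w2 - w3, w1, w2, w3)


def interpolate(weights, deformed_tet):
    """Surface vertex position inside the deformed (predicted) tet."""
    return tuple(
        sum(w * v[axis] for w, v in zip(weights, deformed_tet))
        for axis in range(3)
    )
```

The weights are computed once in the rest pose; each frame then only needs the weighted sum in interpolate, applied to the tet vertex positions predicted for that frame.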

3. Our results

Big vegas (Figure 1) (vtx#: 1468):
Running time: 0.012 s/frame on GPU; 0.017 s/frame on CPU.

Big vegas (vtx#: 39684):
Running time: 0.14 s/frame on GPU; 0.89 s/frame on CPU.

3.1 Homogeneous dynamics

Ortiz (vtx#: 1258)

Motion: cross jumps rotation; frames: 122:
Motion: jazz dancing; frames: 326:

Kaya (vtx#: 1417)

Motion: zombie scream; frames: 167:
Motion: dancing running man; frames: 240:

3.2 Non-homogeneous dynamics

As described in Figure 6, we assign three different material properties to the character. In the legend at the bottom of the video, the gray area is the constraint, the red area has lower stiffness and the pink area has higher stiffness.
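One way to picture the per-vertex setup is as a small lookup from painted region to vertex properties. The sketch below is illustrative only: the stiffness values are hypothetical placeholders rather than the values used in the videos, and it assumes that constrained vertices simply keep their skinned positions while the remaining vertices take the predicted ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VertexProps:
    constrained: bool  # assumption: True means "follow the skinned animation"
    stiffness: float   # hypothetical relative stiffness, not a measured value


# Region colors match the video legend; the numbers are placeholders.
REGIONS = {
    "gray": VertexProps(constrained=True, stiffness=0.0),
    "red": VertexProps(constrained=False, stiffness=0.5),   # lower stiffness
    "pink": VertexProps(constrained=False, stiffness=2.0),  # higher stiffness
}


def vertex_features(labels):
    """Per-vertex (constraint flag, stiffness) pairs for a list of region labels."""
    return [
        (1.0 if REGIONS[name].constrained else 0.0, REGIONS[name].stiffness)
        for name in labels
    ]


def apply_constraints(skinned, predicted, labels):
    """Keep skinned positions on constrained vertices, predictions elsewhere."""
    return [
        s if REGIONS[name].constrained else p
        for s, p, name in zip(skinned, predicted, labels)
    ]
```

Because the properties are stored per vertex, changing the dynamics of one body part only requires relabeling its vertices.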

Michelle (Figure 6) (vtx#: 1105; motion: cross jumps; frames: 122):

Michelle (vtx#: 1105; motion: gangnam style; frames: 371):

Big vegas (vtx#: 1468; motion: cross jumps rotation; frames: 122):

3.3 Performance analysis

In Table 1, we show the per-frame running time t_ours of our method, as well as that of the ground truth method, t_GT, and of a baseline method, t_BL. We adopt implicit backward Euler integration as the ground truth and the faster explicit central differences integration as the baseline. For each method, we record the time needed to compute the dynamic mesh and exclude other components such as initialization, rendering and mesh interpolation. The results indicate that, per frame, our method run on the GPU (CPU) is around 30 (~20) times faster than the implicit integrator and around 3 (~2) times faster than the explicit integrator. As the number of vertices increases, our method becomes even more competitive.
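To make the difference between the two reference integrators concrete, here is a minimal sketch on a single undamped 1D spring, m x'' = -k x. It is a toy model, not the simulation timed in Table 1, and the function names and parameters are illustrative assumptions.

```python
def backward_euler(x, v, k, m, h, steps):
    """Implicit backward Euler for m x'' = -k x.

    Solves v_new = v - h*k*x_new/m and x_new = x + h*v_new in closed form.
    """
    for _ in range(steps):
        x_new = (x + h * v) / (1.0 + h * h * k / m)
        v = (x_new - x) / h
        x = x_new
    return x


def central_differences(x, v, k, m, h, steps):
    """Explicit central differences: x_next = 2x - x_prev + h^2 * a(x)."""
    # Start-up value x_{-1} from a second-order Taylor expansion.
    x_prev = x - h * v + 0.5 * h * h * (-k * x / m)
    for _ in range(steps):
        a = -k * x / m
        x_next = 2.0 * x - x_prev + h * h * a
        x_prev, x = x, x_next
    return x


if __name__ == "__main__":
    # Natural frequency is sqrt(k/m) = 10 rad/s, so h = 0.3 gives h*omega = 3.
    print(backward_euler(1.0, 0.0, 100.0, 1.0, 0.3, 20))
    print(central_differences(1.0, 0.0, 100.0, 1.0, 0.3, 20))
```

Each explicit step is only a few arithmetic operations, but the scheme is stable only for h * sqrt(k/m) < 2, so the second call above diverges. Backward Euler stays bounded for any step size, at the cost of solving a system per step (trivial in 1D, a large sparse solve on a full mesh) and of added numerical damping.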

4. Ablation study and comparisons

Big vegas (Figure 7) (vtx#: 1468; motion: hip hop dancing; frames: 283):

5. Limitations

As discussed in the Conclusion section, the quality of our output decreases if the local geometric detail of a character differs significantly from what was seen during training. For example, the main local structures in the ear regions of the Mousey character do not appear in the volumetric mesh of the sphere used for training. One potential avenue for addressing this is to design more general primitives for training, beyond the tetrahedralized sphere. A thorough study of the types of training primitives and motion sequences would be an interesting future direction.

Mousey (vtx#: 2303; motion: swing dancing; frames: 627):

6. Easter egg

15 different animations of five different characters with predicted secondary motion:

Thanks! Stay healthy and happy.

Comments, questions to Jernej Barbič.

Acknowledgments

NSF (IIS-1911224)
Adobe Research
Bosch Research
USC Annenberg Graduate Fellowships to Mianlun Zheng

Disclaimer

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Copyright notice

The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.
