Learning Neural Deformation Representation for 4D Dynamic Shape Generation
2026-05-31 • Computer Vision and Pattern Recognition
AI summary
The authors focus on generating 4D shapes, which are 3D objects that deform over time. They improve on previous methods by separating shape and motion in their representation, which makes the generated motion more temporally consistent and the shapes faster to render. Their deformation representation predicts skinning weights and rigid transformations for multiple parts of the shape, which helps the model capture the object's structure. They train a diffusion model on the shape and motion features extracted by this representation, leading to better results than earlier works. Their experiments on unconditional generation, conditional generation, and motion retargeting show that the method works well for creating and controlling dynamic 3D shapes.
4D dynamic shapes, 3D shape representation, occupancy fields, neural signed distance fields, diffusion model, motion representation, neural deformation, skinning weights, rigid transformations, motion retargeting
Authors
Gyojin Han, Jiwan Hur, Jaehyun Choi, Junmo Kim
Abstract
Recent developments in 3D shape representation have opened new possibilities for generating detailed 3D shapes. Despite these advances, few studies address the generation of 4D dynamic shapes, that is, 3D objects that deform over time. To bridge this gap, we focus on generating 4D dynamic shapes with an emphasis on both generation quality and efficiency. HyperDiffusion, a previous work on 4D generation, directly generates the weight parameters of 4D occupancy fields, but it suffers from low temporal consistency and slow rendering because its motion representation is not separated from the shape representation of the 4D occupancy fields. We therefore propose a new neural deformation representation and combine it with conditional neural signed distance fields to design a 4D representation architecture in which the motion latent space is disentangled from the shape latent space. The proposed deformation representation, which predicts skinning weights and rigid transformations for multiple parts, also has advantages over the deformation modules of existing 4D representations in capturing the structure of shapes. In addition, we design a training process for a diffusion model that uses the shape and motion features extracted by our 4D representation as data points. Experiments on unconditional generation, conditional generation, and motion retargeting demonstrate that our method not only outperforms previous works in 4D dynamic shape generation but also has various potential applications.
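As an illustration of what deforming a shape through per-part skinning weights and rigid transformations can look like, the following is a minimal NumPy sketch of generic linear blend skinning. It is not the authors' architecture: the abstract does not specify how the weights and transformations are predicted or combined, so the softmax normalization, the function names (`blend_rigid_parts`, `rotation_z`), and the toy inputs are all assumptions made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def blend_rigid_parts(points, weight_logits, rotations, translations):
    """Deform points by blending K rigid transformations (linear blend skinning).

    points:        (N, 3) point positions
    weight_logits: (N, K) unnormalized per-point, per-part scores (assumed form)
    rotations:     (K, 3, 3) rotation matrix of each part
    translations:  (K, 3) translation of each part
    Returns an (N, 3) array of deformed points.
    """
    # Skinning weights: non-negative and summing to 1 over the parts.
    weights = softmax(weight_logits, axis=1)                      # (N, K)
    # Apply every part's rigid transformation to every point.
    moved = np.einsum("kij,nj->kni", rotations, points)           # (K, N, 3)
    moved = moved + translations[:, None, :]
    # Blend the per-part results with the skinning weights.
    return np.einsum("nk,kni->ni", weights, moved)                # (N, 3)


def rotation_z(theta):
    """Rotation matrix for an angle theta (radians) about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Toy example (hypothetical values): two points, two parts.
points = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
# Large logits make the weights nearly one-hot: point 0 -> part 0, point 1 -> part 1.
weight_logits = np.array([[50.0, 0.0], [0.0, 50.0]])
# Part 0 translates along z; part 1 rotates 90 degrees about z.
rotations = np.stack([np.eye(3), rotation_z(np.pi / 2)])
translations = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 0.0]])

deformed = blend_rigid_parts(points, weight_logits, rotations, translations)
```

In this toy setup each point follows a single part almost exclusively, so the first point is shifted along z and the second is rotated about the z-axis. With softer weights, a point near a part boundary would move as a weighted mix of the neighboring parts' motions.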