FreeGaussian: Guidance-free Controllable 3D Gaussian Splats with Flow Derivatives

Chen, Qizhi; Qu, Delin; Tang, Yiwen; Song, Haoming; Zhang, Yiting; Wang, Dong; Zhao, Bin; Li, Xuelong

arXiv.org Artificial Intelligence 

Reconstructing controllable Gaussian splats from monocular video is challenging because the problem is inherently under-constrained. Widely adopted approaches supervise complex interactions with additional masks and control signal annotations, which limits their real-world applicability. In this paper, we propose an annotation guidance-free method, dubbed FreeGaussian, that mathematically derives dynamic Gaussian motion from optical flow and camera motion using novel dynamic Gaussian constraints. By establishing a connection between 2D flows and 3D Gaussian dynamic control, our method enables self-supervised optimization and continuity of dynamic Gaussian motions from flow priors. Furthermore, we introduce a 3D spherical vector control scheme, which represents the state with a 3D Gaussian trajectory, thereby eliminating the need for complex 1D control signal calculations and simplifying controllable Gaussian modeling. Extensive quantitative and qualitative evaluations demonstrate the state-of-the-art visual performance and control capability of our method.

Mainstream methods (Yu et al., 2023a; Fridovich-Keil et al., 2023) have recently achieved high-quality real-time rendering via the 3D Gaussian representation (Kerbl et al., 2023b) and have been extended to the scene level using large-scale annotated datasets (Qu et al., 2024).
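As a rough illustration of the idea stated in the abstract, that dynamic motion can be derived from optical flow and camera motion, the sketch below subtracts the flow a static scene would produce under a known camera motion from the observed optical flow. It is not the paper's formulation: the pinhole model, the per-pixel depth map, and the helper names `camera_flow` and `dynamic_flow` are all assumptions made here for illustration.

```python
import numpy as np

def camera_flow(depth, K, R, t):
    """Flow that camera motion alone would induce on a static scene.

    Hypothetical helper (not from the paper).
    depth: (H, W) depth in frame 1; K: (3, 3) pinhole intrinsics;
    R (3, 3), t (3,): map frame-1 camera coordinates to frame-2.
    Returns an (H, W, 2) array of (du, dv) pixel displacements.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W, dtype=np.float64),
                       np.arange(H, dtype=np.float64))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)  # homogeneous pixels
    rays = pix @ np.linalg.inv(K).T                   # back-project to rays
    pts1 = rays * depth[..., None]                    # 3D points, frame 1
    pts2 = pts1 @ R.T + t                             # same points, frame 2
    proj = pts2 @ K.T                                 # project into frame 2
    uv2 = proj[..., :2] / proj[..., 2:3]
    return uv2 - pix[..., :2]

def dynamic_flow(optical_flow, depth, K, R, t):
    """Residual flow after removing the camera-induced component.

    Under the static-scene assumption for the background, nonzero
    residuals indicate pixels that moved independently of the camera.
    """
    return optical_flow - camera_flow(depth, K, R, t)
```

In this simplified setting, pixels whose residual flow is close to zero are consistent with a static background, while large residuals mark candidate dynamic regions that could then be used as a self-supervised signal.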