Sharp, Nicholas
SpaceMesh: A Continuous Representation for Learning Manifold Surface Meshes
Shen, Tianchang, Li, Zhaoshuo, Law, Marc, Atzmon, Matan, Fidler, Sanja, Lucas, James, Gao, Jun, Sharp, Nicholas
Meshes are ubiquitous in visual computing and simulation, yet most existing machine learning techniques represent meshes only indirectly, e.g. as the level set of a scalar field or deformation of a template, or as a disordered triangle soup lacking local structure. This work presents a scheme to directly generate manifold, polygonal meshes of complex connectivity as the output of a neural network. Our key innovation is to define a continuous latent connectivity space at each mesh vertex, which implies the discrete mesh. In particular, our vertex embeddings generate cyclic neighbor relationships in a halfedge mesh representation, which gives a guarantee of edge-manifoldness and the ability to represent general polygonal meshes. This representation is well-suited to machine learning and stochastic optimization, without restriction on connectivity or topology. We first explore the basic properties of this representation, then use it to fit distributions of meshes from large datasets. The resulting models generate diverse meshes with tessellation structure learned from the dataset population, with concise details and high-quality mesh elements. In applications, this approach not only yields high-quality outputs from generative models, but also enables directly learning challenging geometry processing tasks such as mesh repair.
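The cyclic-neighbor construction the abstract alludes to can be made concrete with a small sketch: given a cyclic ordering of neighbors at each vertex (a rotation system), the polygon faces of a halfedge mesh follow by walking directed edges. The learned vertex embeddings that produce these orderings are the paper's contribution and are not modeled here; the example below hard-codes the orderings for a single quad, purely to illustrate how cyclic neighbor relationships imply a mesh.

```python
# Rotation-system face tracing: neighbors[v] lists v's neighbors in cyclic order.
def faces_from_rotation_system(neighbors):
    next_vertex = {}
    for v, ring in neighbors.items():
        for i, u in enumerate(ring):
            # Arriving at v along edge (u, v), leave toward the next neighbor in v's ring.
            next_vertex[(u, v)] = ring[(i + 1) % len(ring)]
    faces, visited = [], set()
    for start in next_vertex:                      # each directed edge is a halfedge
        if start in visited:
            continue
        face, he = [], start
        while he not in visited:
            visited.add(he)
            face.append(he[0])
            he = (he[1], next_vertex[he])          # step to the next halfedge of the face
        faces.append(face)
    return faces

# A single quad 0-1-2-3; each vertex lists its two neighbors in cyclic order.
rings = {0: [1, 3], 1: [2, 0], 2: [3, 1], 3: [0, 2]}
print(faces_from_rotation_system(rings))           # the quad plus its outer (boundary) face
```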
Simplicits: Mesh-Free, Geometry-Agnostic, Elastic Simulation
Modi, Vismay, Sharp, Nicholas, Perel, Or, Sueda, Shinjiro, Levin, David I. W.
The proliferation of 3D representations, from explicit meshes to implicit neural fields and more, motivates the need for simulators agnostic to representation. We present a data-, mesh-, and grid-free solution for elastic simulation for any object in any geometric representation undergoing large, nonlinear deformations. We note that every standard geometric representation can be reduced to an occupancy function queried at any point in space, and we define a simulator atop this common interface. For each object, we fit a small implicit neural network encoding spatially varying weights that act as a reduced deformation basis. These weights are trained to learn physically significant motions in the object via random perturbations. Our loss ensures we find a weight-space basis that best minimizes deformation energy by stochastically evaluating elastic energies through Monte Carlo sampling of the deformation volume. At runtime, we simulate in the reduced basis and sample the deformations back to the original domain. Our experiments demonstrate the versatility, accuracy, and speed of this approach on data including signed distance functions, point clouds, neural primitives, tomography scans, radiance fields, Gaussian splats, surface meshes, and volume meshes, as well as showing a variety of material energies, contact models, and time integration schemes.
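A rough sketch of the pipeline outlined above, under heavy simplification: an occupancy query is the only interface to the geometry, a couple of affine "handles" blended by spatially varying weights stand in for the fitted neural weight field, and the elastic energy is estimated by Monte Carlo sampling of the occupied volume. The specific weight function, handles, and material density below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Occupancy is the only geometry interface; a unit sphere stands in for whatever
# representation (mesh, SDF, Gaussians, ...) backs the query.
def occupancy(x):
    return np.linalg.norm(x, axis=-1) < 1.0

# Reduced kinematics: affine handles blended by spatially varying weights.
# In the paper the weights come from a small fitted neural field; a fixed
# smooth function stands in for it here.
handles = [np.eye(3), np.diag([1.2, 0.9, 0.95])]
def weights(x):
    w = np.stack([np.ones(len(x)), np.clip(x[:, 0] + 1.0, 0.0, None)], axis=1) + 1e-6
    return w / w.sum(axis=1, keepdims=True)

def deform(x):
    w = weights(x)
    return sum(w[:, [h]] * (x @ handles[h].T) for h in range(len(handles)))

# Monte Carlo estimate of an elastic energy over the occupied volume, using a
# finite-difference deformation gradient and a toy strain-energy density.
def energy(n_samples=20000, eps=1e-3, mu=1.0):
    x = np.random.uniform(-1, 1, size=(n_samples, 3))
    inside = occupancy(x)
    x = x[inside]
    F = np.stack([(deform(x + eps * np.eye(3)[i]) - deform(x - eps * np.eye(3)[i])) / (2 * eps)
                  for i in range(3)], axis=2)          # deformation gradient, (n, 3, 3)
    E = 0.5 * (np.swapaxes(F, 1, 2) @ F - np.eye(3))   # Green strain
    density = mu * (E ** 2).sum(axis=(1, 2))
    return density.mean() * 8.0 * inside.mean()        # box volume times occupied fraction

print("estimated elastic energy:", energy())
```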
TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models
Cao, Tianshi, Kreis, Karsten, Fidler, Sanja, Sharp, Nicholas, Yin, Kangxue
We present TexFusion (Texture Diffusion), a new method to synthesize textures for given 3D geometries, using large-scale text-guided image diffusion models. In contrast to recent works that leverage 2D text-to-image diffusion models to distill 3D objects using a slow and fragile optimization process, TexFusion introduces a new 3D-consistent generation technique specifically designed for texture synthesis that employs regular diffusion model sampling on different 2D rendered views. Specifically, we leverage latent diffusion models, apply the diffusion model's denoiser on a set of 2D renders of the 3D object, and aggregate the different denoising predictions on a shared latent texture map. Final output RGB textures are produced by optimizing an intermediate neural color field on the decodings of 2D renders of the latent texture. We thoroughly validate TexFusion and show that we can efficiently generate diverse, high quality and globally coherent textures. We achieve state-of-the-art text-guided texture synthesis performance using only image diffusion models, while avoiding the pitfalls of previous distillation-based methods. The text-conditioning offers detailed control and we also do not rely on any ground truth 3D textures for training. This makes our method versatile and applicable to a broad range of geometry and texture types. We hope that TexFusion will advance AI-based texturing of 3D assets for applications in virtual reality, game design, simulation, and more.
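The aggregation step described above, where denoising predictions from several rendered views are combined on one shared latent texture map, can be sketched as a weighted scatter-average. The diffusion denoiser, the renderer, and the UV rasterization that would supply the inputs are stubbed with random data here; only the aggregation pattern is shown.

```python
import numpy as np

TEX_RES, LATENT_C = 64, 4
texture = np.zeros((TEX_RES * TEX_RES, LATENT_C))    # shared latent texture map
weight = np.zeros((TEX_RES * TEX_RES, 1))

def aggregate_view(denoised_latent, texel_ids, visibility):
    """denoised_latent: (n_pixels, C) denoiser prediction for one rendered view,
    texel_ids: (n_pixels,) texel index each pixel maps to (from rasterization),
    visibility: (n_pixels,) weights, e.g. zero for occluded or grazing pixels."""
    np.add.at(texture, texel_ids, visibility[:, None] * denoised_latent)
    np.add.at(weight, texel_ids, visibility[:, None])

# Fake data standing in for three rendered views at one diffusion timestep.
rng = np.random.default_rng(0)
for _ in range(3):
    n_pixels = 5000
    ids = rng.integers(0, TEX_RES * TEX_RES, size=n_pixels)
    aggregate_view(rng.normal(size=(n_pixels, LATENT_C)), ids, np.ones(n_pixels))

shared_latent_texture = texture / np.maximum(weight, 1e-8)   # per-texel average over views
print(shared_latent_texture.shape)
```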
Flexible Isosurface Extraction for Gradient-Based Mesh Optimization
Shen, Tianchang, Munkberg, Jacob, Hasselgren, Jon, Yin, Kangxue, Wang, Zian, Chen, Wenzheng, Gojcic, Zan, Fidler, Sanja, Sharp, Nicholas, Gao, Jun
This work considers gradient-based mesh optimization, where we iteratively optimize for a 3D surface mesh by representing it as the isosurface of a scalar field, an increasingly common paradigm in applications including photogrammetry, generative modeling, and inverse physics. Existing implementations adapt classic isosurface extraction algorithms like Marching Cubes or Dual Contouring; these techniques were designed to extract meshes from fixed, known fields, and in the optimization setting they lack the degrees of freedom to represent high-quality feature-preserving meshes, or suffer from numerical instabilities. We introduce FlexiCubes, an isosurface representation specifically designed for optimizing an unknown mesh with respect to geometric, visual, or even physical objectives. Our main insight is to introduce additional carefully-chosen parameters into the representation, which allow local flexible adjustments to the extracted mesh geometry and connectivity. These parameters are updated along with the underlying scalar field via automatic differentiation when optimizing for a downstream task. We base our extraction scheme on Dual Marching Cubes for improved topological properties, and present extensions to optionally generate tetrahedral and hierarchically-adaptive meshes. Extensive experiments validate FlexiCubes on both synthetic benchmarks and real-world applications, showing that it offers significant improvements in mesh quality and geometric fidelity.
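A toy illustration of the core idea, reduced to a single 2D cell: the dual vertex is placed as a weighted combination of the sign-crossing points on the cell's edges, with the weights exposed as extra differentiable parameters so that gradients from a downstream objective reach both the scalar field and the placement. This is a simplification for illustration, not the full FlexiCubes scheme (no Dual Marching Cubes tables, tetrahedra, or adaptivity).

```python
import torch

corners = torch.tensor([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
sdf = torch.tensor([-0.3, 0.4, 0.6, -0.2], requires_grad=True)   # scalar field at corners
alpha = torch.zeros(4, requires_grad=True)                        # extra per-edge weights

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
crossings, logits = [], []
for w, (i, j) in zip(alpha, edges):
    if (sdf[i] < 0) != (sdf[j] < 0):                   # edge crosses the isosurface
        t = sdf[i] / (sdf[i] - sdf[j])                  # linear zero crossing
        crossings.append((1 - t) * corners[i] + t * corners[j])
        logits.append(w)

weights = torch.softmax(torch.stack(logits), dim=0)
dual_vertex = (weights[:, None] * torch.stack(crossings)).sum(dim=0)

# Any geometric, visual, or physical objective can now flow gradients into both
# the field values and the placement weights; here, pull the vertex to a target.
loss = ((dual_vertex - torch.tensor([0.7, 0.5])) ** 2).sum()
loss.backward()
print(dual_vertex.detach(), sdf.grad, alpha.grad)
```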
ATT3D: Amortized Text-to-3D Object Synthesis
Lorraine, Jonathan, Xie, Kevin, Zeng, Xiaohui, Lin, Chen-Hsuan, Takikawa, Towaki, Sharp, Nicholas, Lin, Tsung-Yi, Liu, Ming-Yu, Fidler, Sanja, Lucas, James
Text-to-3D modelling has seen exciting progress by combining generative text-to-image models with image-to-3D methods like Neural Radiance Fields. DreamFusion recently achieved high-quality results but requires a lengthy, per-prompt optimization to create 3D objects. To address this, we amortize optimization over text prompts by training on many prompts simultaneously with a unified model, instead of separately. With this, we share computation across a prompt set, training in less time than per-prompt optimization. Our framework - Amortized text-to-3D (ATT3D) - enables knowledge-sharing between prompts to generalize to unseen setups and smooth interpolations between text for novel assets and simple animations.
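The amortization pattern itself is simple to sketch: one shared text-conditioned model is optimized over the whole prompt set in a single loop, rather than running a separate optimization per prompt. The text encoder, 3D representation, renderer, and score-distillation loss of the real pipeline are all stubbed out below; only the training structure is illustrated.

```python
import torch
import torch.nn as nn

prompt_embeddings = torch.randn(100, 128)            # stand-in for frozen text-encoder outputs
shared_model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 64))
opt = torch.optim.Adam(shared_model.parameters(), lr=1e-3)

def distillation_loss(scene_params):                  # placeholder for SDS on rendered views
    return (scene_params ** 2).mean()

for step in range(1000):
    idx = torch.randint(0, len(prompt_embeddings), (8,))   # sample a batch of prompts
    scene_params = shared_model(prompt_embeddings[idx])    # amortized: one net serves all prompts
    loss = distillation_loss(scene_params)
    opt.zero_grad(); loss.backward(); opt.step()

# After training, unseen or interpolated embeddings map directly to outputs,
# e.g. shared_model(0.5 * (e1 + e2)) for a smooth blend between two prompts.
```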
Data-Free Learning of Reduced-Order Kinematics
Sharp, Nicholas, Romero, Cristian, Jacobson, Alec, Vouga, Etienne, Kry, Paul G., Levin, David I. W., Solomon, Justin
Physical systems ranging from elastic bodies to kinematic linkages are defined on high-dimensional configuration spaces, yet their typical low-energy configurations are concentrated on much lower-dimensional subspaces. This work addresses the challenge of identifying such subspaces automatically: given as input an energy function for a high-dimensional system, we produce a low-dimensional map whose image parameterizes a diverse yet low-energy submanifold of configurations. The only additional input needed is a single seed configuration for the system to initialize our procedure; no dataset of trajectories is required. We represent subspaces as neural networks that map a low-dimensional latent vector to the full configuration space, and propose a training scheme to fit network parameters to any system of interest. This formulation is effective across a very general range of physical systems; our experiments demonstrate not only nonlinear and very low-dimensional elastic body and cloth subspaces, but also more general systems like colliding rigid bodies and linkages. We briefly explore applications built on this formulation, including manipulation, latent interpolation, and sampling.
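A toy version of the recipe described above: a small network maps a low-dimensional latent to the full configuration space and is trained, with no trajectory data, so that decoded configurations have low energy while staying spread out. The mass-spring energy, the pairwise-distance regularizer, and the seed-configuration handling below are simplifications chosen for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

n_points = 20                                  # a hanging chain of 2D points
rest = torch.stack([torch.linspace(0, 1, n_points), torch.zeros(n_points)], dim=1)

def energy(q):                                 # toy "given" energy: springs plus gravity
    q = q.view(-1, n_points, 2)
    stretch = ((q[:, 1:] - q[:, :-1]).norm(dim=-1) - 1.0 / (n_points - 1)) ** 2
    gravity = 9.8 * q[..., 1].mean(dim=-1)
    return stretch.sum(dim=-1) + gravity

latent_dim = 2
net = nn.Sequential(nn.Linear(latent_dim, 64), nn.ELU(), nn.Linear(64, n_points * 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def decode(z):                                 # subspace map, offset from the seed configuration
    return rest.flatten() + net(z)

for step in range(2000):
    z = torch.randn(32, latent_dim)
    q = decode(z)
    # keep the map from collapsing: decoded spread should track latent spread
    dq, dz = torch.pdist(q), torch.pdist(z)
    spread = ((dq / (dz + 1e-6)) - 1.0).pow(2).mean()
    loss = energy(q).mean() + spread
    opt.zero_grad(); loss.backward(); opt.step()
```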
PointTriNet: Learned Triangulation of 3D Point Sets
Sharp, Nicholas, Ovsjanikov, Maks
This work considers a new task in geometric deep learning: generating a triangulation among a set of points in 3D space. We present PointTriNet, a differentiable and scalable approach enabling point set triangulation as a layer in 3D learning pipelines. The method iteratively applies two neural networks: a classification network predicts whether a candidate triangle should appear in the triangulation, while a proposal network suggests additional candidates. Both networks are structured as PointNets over nearby points and triangles, using a novel triangle-relative input encoding. Since these learning problems operate on local geometric data, our method is efficient and scalable, and generalizes to unseen shape categories. Our networks are trained in an unsupervised manner from a collection of shapes represented as point clouds. We demonstrate the effectiveness of this approach for classical meshing tasks, robustness to outliers, and as a component in end-to-end learning systems.
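A structural sketch of the propose-then-classify loop, with both learned networks replaced by stand-in heuristics (an assumption for brevity): proposals come from each point's nearest neighbors, and the "classifier" scores a candidate triangle by its compactness. The real method instead uses PointNet-style networks over triangle-relative encodings of nearby points and triangles, and alternates the two steps iteratively.

```python
import numpy as np

def knn(points, k):
    d = np.linalg.norm(points[:, None] - points[None], axis=-1)
    return np.argsort(d, axis=1)[:, 1:k + 1]          # k nearest neighbors, excluding self

def propose(points, k=6):
    """Stand-in proposal step: candidate triangles from each point's k-NN."""
    nbrs = knn(points, k)
    cands = {tuple(sorted((i, int(a), int(b))))
             for i in range(len(points))
             for a in nbrs[i] for b in nbrs[i] if a != b}
    return sorted(cands)

def classify(points, tri):
    """Stand-in classification step: score a candidate by its compactness."""
    a, b, c = (np.linalg.norm(points[u] - points[v])
               for u, v in [(tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])])
    s = 0.5 * (a + b + c)
    area = np.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))
    return 4.0 * np.sqrt(3.0) * area / (a * a + b * b + c * c + 1e-12)  # 1.0 for equilateral

points = np.random.rand(50, 3)
candidates = propose(points)
kept = [t for t in candidates if classify(points, t) > 0.6]
print(len(candidates), "candidates ->", len(kept), "kept triangles")
```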