AITopics | novel-view synthesis

Collaborating Authors

novel-view synthesis

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

f95ec3de395b4bce25b39ef6138da871-Supplemental.pdf

Neural Information Processing SystemsFeb-12-2026, 00:20:54 GMT

ners, novel-view synthesis, synthesis, (16 more...)

Neural Information Processing Systems

Country: Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

MoAngelo: Motion-Aware Neural Surface Reconstruction for Dynamic Scenes

Ebbed, Mohamed, Lähner, Zorah

arXiv.org Artificial IntelligenceSep-22-2025

Dynamic scene reconstruction from multi-view videos remains a fundamental challenge in computer vision. While recent neural surface reconstruction methods have achieved remarkable results in static 3D reconstruction, extending these approaches with comparable quality for dynamic scenes introduces significant computational and representational challenges. Existing dynamic methods focus on novel-view synthesis, therefore, their extracted meshes tend to be noisy. Even approaches aiming for geometric fidelity often result in too smooth meshes due to the ill-posedness of the problem. We present a novel framework for highly detailed dynamic reconstruction that extends the static 3D reconstruction method NeuralAngelo to work in dynamic settings. To that end, we start with a high-quality template scene reconstruction from the initial frame using NeuralAngelo, and then jointly optimize deformation fields that track the template and refine it based on the temporal sequence. This flexible template allows updating the geometry to include changes that cannot be modeled with the deformation field, for instance occluded parts or the changes in the topology. We show superior reconstruction accuracy in comparison to previous state-of-the-art methods on the ActorsHQ dataset.

artificial intelligence, machine learning, reconstruction, (17 more...)

arXiv.org Artificial Intelligence

2509.15892

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Neural Reflectance Surfaces (NeRS) Supplementary Materials

Neural Information Processing SystemsAug-19-2025, 00:21:53 GMT

We describe an additional ablation on view-dependence in Sec. Marketplace Cars (MVMC) dataset for fixed camera novel-view synthesis in Tab. 1 and in-the-wild Quantitative evaluation of novel-view synthesis on MVMC using fixed pseudo-ground truth cameras. Given one of the images as a target image, we randomly select one of the remaining images as the initial training image. In this table, we show the performance before ("Approx. All methods improve from the refinement.

artificial intelligence, machine learning, synthesis, (17 more...)

Neural Information Processing Systems

Country: Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

NVSPolicy: Adaptive Novel-View Synthesis for Generalizable Language-Conditioned Policy Learning

Shi, Le, Shi, Yifei, Xu, Xin, Liu, Tenglong, Xi, Junhua, Chen, Chengyuan

arXiv.org Artificial IntelligenceMay-16-2025

Recent advances in deep generative models demonstrate unprecedented zero-shot generalization capabilities, offering great potential for robot manipulation in unstructured environments. Given a partial observation of a scene, deep generative models could generate the unseen regions and therefore provide more context, which enhances the capability of robots to generalize across unseen environments. However, due to the visual artifacts in generated images and inefficient integration of multi-modal features in policy learning, this direction remains an open challenge. We introduce NVSPolicy, a generalizable language-conditioned policy learning method that couples an adaptive novel-view synthesis module with a hierarchical policy network. Given an input image, NVSPolicy dynamically selects an informative viewpoint and synthesizes an adaptive novel-view image to enrich the visual context. To mitigate the impact of the imperfect synthesized images, we adopt a cycle-consistent VAE mechanism that disentangles the visual features into the semantic feature and the remaining feature. The two features are then fed into the hierarchical policy network respectively: the semantic feature informs the high-level meta-skill selection, and the remaining feature guides low-level action estimation. Moreover, we propose several practical mechanisms to make the proposed method efficient. Extensive experiments on CALVIN demonstrate the state-of-the-art performance of our method. Specifically, it achieves an average success rate of 90.4\% across all tasks, greatly outperforming the recent methods. Ablation studies confirm the significance of our adaptive novel-view synthesis paradigm. In addition, we evaluate NVSPolicy on a real-world robotic platform to demonstrate its practical applicability.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.10359

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.45)

Add feedback

SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields

Liu, Yu, Jia, Baoxiong, Chen, Yixin, Huang, Siyuan

arXiv.org Artificial IntelligenceAug-13-2024

Despite the significant progress in object-centric learning methods, learning object-centric representations in the 3D physical world remains a crucial challenge. In this work, we propose SlotLifter, a novel object-centric radiance model addressing scene reconstruction and decomposition jointly via slot-guided feature lifting. Such a design unites object-centric learning representations and image-based rendering methods, offering state-of-the-art performance in scene decomposition and novel-view synthesis on four challenging synthetic and four complex real-world datasets, outperforming existing 3D object-centric learning methods by a large margin.

proceedings, slot-guided feature lifting, slotlifter, (10 more...)

arXiv.org Artificial Intelligence

2408.06697

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > Promising Solution (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

ViVid-1-to-3: Novel View Synthesis with Video Diffusion Models

Kwak, Jeong-gi, Dong, Erqun, Jin, Yuhe, Ko, Hanseok, Mahajan, Shweta, Yi, Kwang Moo

arXiv.org Artificial IntelligenceDec-3-2023

Generating novel views of an object from a single image is a challenging task. It requires an understanding of the underlying 3D structure of the object from an image and rendering high-quality, spatially consistent new views. While recent methods for view synthesis based on diffusion have shown great progress, achieving consistency among various view estimates and at the same time abiding by the desired camera pose remains a critical problem yet to be solved. In this work, we demonstrate a strikingly simple method, where we utilize a pre-trained video diffusion model to solve this problem. Our key idea is that synthesizing a novel view could be reformulated as synthesizing a video of a camera going around the object of interest -- a scanning video -- which then allows us to leverage the powerful priors that a video diffusion model would have learned. Thus, to perform novel-view synthesis, we create a smooth camera trajectory to the target view that we wish to render, and denoise using both a view-conditioned diffusion model and a video diffusion model. By doing so, we obtain a highly consistent novel view synthesis, outperforming the state of the art.

diffusion model, synthesis, video, (12 more...)

arXiv.org Artificial Intelligence

2312.01305

Country:

North America > Canada > British Columbia (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

ORCa: Glossy Objects as Radiance Field Cameras

Tiwary, Kushagra, Dave, Akshat, Behari, Nikhil, Klinghoffer, Tzofi, Veeraraghavan, Ashok, Raskar, Ramesh

arXiv.org Artificial IntelligenceDec-12-2022

Reflections on glossy objects contain valuable and hidden information about the surrounding environment. By converting these objects into cameras, we can unlock exciting applications, including imaging beyond the camera's field-of-view and from seemingly impossible vantage points, e.g. from reflections on the human eye. However, this task is challenging because reflections depend jointly on object geometry, material properties, the 3D environment, and the observer viewing direction. Our approach converts glossy objects with unknown geometry into radiance-field cameras to image the world from the object's perspective. Our key insight is to convert the object surface into a virtual sensor that captures cast reflections as a 2D projection of the 5D environment radiance field visible to the object. We show that recovering the environment radiance fields enables depth and radiance estimation from the object to its surroundings in addition to beyond field-of-view novel-view synthesis, i.e. rendering of novel views that are only directly-visible to the glossy object present in the scene, but not the observer. Moreover, using the radiance field we can image around occluders caused by close-by objects in the scene. Our method is trained end-to-end on multi-view images of the object and jointly estimates object geometry, diffuse radiance, and the 5D environment radiance field.

artificial intelligence, radiance, radiance field, (17 more...)

arXiv.org Artificial Intelligence

2212.04531

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Vision (1.00)

Add feedback