AITopics | stereo image pair

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems, Feb-7-2025, 08:19:04 GMT

This paper addresses the problem of generating 3D object proposals given a stereo image pair from an autonomous driving vehicle. The paper proposes a set of features for a 3D cuboid over a point cloud and ground plane derived from the stereo image pair. The features include point cloud density, free space, object height prior, and object height relative to its surroundings. Note that the features are dependent on knowledge of the object class (other "objectness" proposal methods are agnostic to the object class). A structural SVM is trained to predict the "objectness" of the 3D cuboid proposal.
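
As a rough illustration of the point cloud density feature mentioned in this review, the sketch below scores an axis-aligned cuboid by the fraction of its voxels that contain at least one point. This is only one plausible reading of "density"; the function name `cuboid_density`, the voxel size, and the axis-aligned box are assumptions made for the example and are not taken from the paper.

```python
import math


def cuboid_density(points, box_min, box_max, voxel=0.5):
    """Fraction of voxels inside an axis-aligned cuboid that hold >= 1 point.

    Hypothetical helper: the voxelisation and the 0.5 default are assumptions.
    """
    # Number of voxels along each axis (at least one per axis).
    dims = [max(1, math.ceil((hi - lo) / voxel)) for lo, hi in zip(box_min, box_max)]
    occupied = set()
    for p in points:
        # Ignore points that fall outside the candidate cuboid.
        if all(lo <= v <= hi for v, lo, hi in zip(p, box_min, box_max)):
            # Clamp so points lying exactly on the upper face stay in range.
            idx = tuple(
                min(n - 1, int((v - lo) / voxel))
                for v, lo, n in zip(p, box_min, dims)
            )
            occupied.add(idx)
    return len(occupied) / (dims[0] * dims[1] * dims[2])


cloud = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (0.9, 0.9, 0.9), (5.0, 5.0, 5.0)]
density = cuboid_density(cloud, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
```

A score like this could be one entry of the feature vector passed to a classifier; the free-space and height features described in the review would need their own definitions.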

author feedback and meta-review, detection, stereo image pair, (9 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.36)


StereoDiffusion: Training-Free Stereo Image Generation Using Latent Diffusion Models

Wang, Lezhong, Frisvad, Jeppe Revall, Jensen, Mark Bo, Bigdeli, Siavash Arjomand

arXiv.org Artificial Intelligence, Jun-2-2024

The demand for stereo images increases as manufacturers launch more XR devices. To meet this demand, we introduce StereoDiffusion, a method that, unlike traditional inpainting pipelines, is training-free, remarkably straightforward to use, and integrates seamlessly into the original Stable Diffusion model. Our method modifies the latent variable to provide an end-to-end, lightweight capability for fast generation of stereo image pairs, without the need for fine-tuning model weights or any post-processing of images. Using the original input to generate a left image and estimate a disparity map for it, we generate the latent vector for the right image through Stereo Pixel Shift operations, complemented by Symmetric Pixel Shift Masking Denoise and Self-Attention Layers Modification methods to align the right-side image with the left-side image. Moreover, our proposed method maintains a high standard of image quality throughout the stereo generation process, achieving state-of-the-art scores in various quantitative evaluations.
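
The idea of shifting left-image content by a disparity map to obtain a right view can be sketched as follows. This is a generic disparity-based forward warp on a plain 2D grid standing in for one latent channel; it is not the authors' Stereo Pixel Shift implementation. The name `stereo_pixel_shift`, the non-negative integer disparities, and the "nearest surface wins" occlusion rule are all assumptions made for the example.

```python
def stereo_pixel_shift(left, disparity, fill=None):
    """Forward-warp a left view to a right view by shifting pixels left.

    Hypothetical sketch: assumes non-negative disparities measured in
    grid cells; larger disparity (closer surface) wins on collisions.
    Returns the warped grid and a mask of unfilled (disoccluded) cells.
    """
    h, w = len(left), len(left[0])
    right = [[fill] * w for _ in range(h)]
    best = [[-1] * w for _ in range(h)]  # largest disparity written so far
    for y in range(h):
        for x in range(w):
            d = int(round(disparity[y][x]))
            tx = x - d  # target column in the right view
            if 0 <= tx < w and d > best[y][tx]:
                right[y][tx] = left[y][x]
                best[y][tx] = d
    # Cells never written to are holes that a later step must fill.
    holes = [[best[y][x] < 0 for x in range(w)] for y in range(h)]
    return right, holes


left_row = [[1, 2, 3, 4]]
disp_row = [[0, 0, 2, 2]]
right_row, hole_mask = stereo_pixel_shift(left_row, disp_row)
```

The hole mask marks regions visible only from the right viewpoint. In the abstract, aligning and completing the right image is the role of the masking denoise and self-attention modifications, which this sketch does not attempt to reproduce.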

disparity map, lpip, ssim, (12 more...)


2403.04965

Country:

Europe > Denmark > Capital Region > Kongens Lyngby (0.04)
Asia (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


Learning to Render Novel Views from Wide-Baseline Stereo Pairs

Du, Yilun, Smith, Cameron, Tewari, Ayush, Sitzmann, Vincent

arXiv.org Artificial Intelligence, Apr-17-2023

We introduce a method for novel view synthesis given only a single wide-baseline stereo image pair. In this challenging regime, 3D scene points are regularly observed only once, requiring prior-based reconstruction of scene geometry and appearance. We find that existing approaches to novel view synthesis from sparse observations fail due to recovering incorrect 3D geometry and due to the high cost of differentiable rendering that precludes their scaling to large-scale training. We take a step towards resolving these shortcomings by formulating a multi-view transformer encoder, proposing an efficient, image-space epipolar line sampling scheme to assemble image features for a target ray, and a lightweight cross-attention-based renderer. Our contributions enable training of our method on a large-scale real-world dataset of indoor and outdoor scenes. We demonstrate that our method learns powerful multi-view geometry priors while reducing the rendering time. We conduct extensive comparisons on held-out test scenes across two real-world datasets, significantly outperforming prior work on novel view synthesis from sparse image observations and achieving multi-view-consistent novel view synthesis.
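
For readers unfamiliar with epipolar geometry, the sketch below shows the textbook way to obtain and sample an epipolar line in image space: a pixel in one view maps to the line l = F x in the other view, where F is the fundamental matrix. This is standard two-view geometry rather than the paper's sampling scheme; the function names, the uniform sampling, and the rectified-stereo example matrix are assumptions chosen for the illustration.

```python
def epipolar_line(F, pixel):
    """Return line coefficients (a, b, c) with a*x + b*y + c = 0 in view 2."""
    u, v = pixel
    x = (u, v, 1.0)  # homogeneous pixel coordinates
    return tuple(sum(F[r][c] * x[c] for c in range(3)) for r in range(3))


def sample_epipolar_line(F, pixel, width, height, n=8):
    """Sample up to n points along the epipolar line, kept inside the image.

    Hypothetical helper: samples uniformly along the line's dominant axis.
    """
    a, b, c = epipolar_line(F, pixel)
    samples = []
    for i in range(n):
        t = i / (n - 1) if n > 1 else 0.0
        if abs(b) >= abs(a):
            if b == 0:
                return []  # degenerate line (a == b == 0)
            x = t * (width - 1)
            y = -(a * x + c) / b  # mostly horizontal: step along x
        else:
            y = t * (height - 1)
            x = -(b * y + c) / a  # mostly vertical: step along y
        if 0 <= x <= width - 1 and 0 <= y <= height - 1:
            samples.append((x, y))
    return samples


# Fundamental matrix of an ideal rectified pair: epipolar lines are rows.
F_rectified = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
points = sample_epipolar_line(F_rectified, (5.0, 3.0), width=8, height=6, n=4)
```

In a feature-gathering pipeline, image features would be looked up at each sampled location; how the paper selects and weights those samples is not specified in the abstract.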

artificial intelligence, machine learning, synthesis, (15 more...)


2304.08463

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Unsupervised learning of depth and motion

Konda, Kishore, Memisevic, Roland

arXiv.org Machine Learning, Dec-16-2013

We present a model for the joint estimation of disparity and motion. The model is based on learning about the interrelations between images from multiple cameras, multiple frames in a video, or the combination of both. We show that learning depth and motion cues, as well as their combinations, from data is possible within a single type of architecture and a single type of learning algorithm, by using biologically inspired "complex cell"-like units, which encode correlations between the pixels across image pairs. Our experimental results show that learning depth and motion makes it possible to achieve state-of-the-art performance in 3-D activity analysis and to outperform existing hand-engineered 3-D motion features by a very large margin.

image pair, information, representation, (17 more...)


1312.3429

Country:

North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Hesse > Darmstadt Region > Frankfurt (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
