Close your left eye as you look at this screen. Now close your right eye and open your left -- you'll notice that your field of vision shifts depending on which eye you're using. That's because each eye captures a flat, two-dimensional view; the images from your two retinas are combined to provide depth and produce a sense of three-dimensionality. Machine learning models need this same capability to accurately understand image data. NVIDIA researchers have now made this possible with a rendering framework called DIB-R -- a differentiable interpolation-based renderer -- that produces 3D objects from 2D images. The researchers will present their model this week at the annual Conference on Neural Information Processing Systems (NeurIPS) in Vancouver.
A method of machine learning has proven capable of turning 2D images into 3D models. Created by researchers at multibillion-dollar GPU manufacturer NVIDIA, the framework shows that it is possible to infer shape, texture, and light from a single image, much as the human eye does. "Close your left eye as you look at this screen. Now close your right eye and open your left," writes NVIDIA PR specialist Lauren Finkle on the company blog, "you'll notice that your field of vision shifts depending on which eye you're using. That's because while we see in two dimensions, the images captured by your retinas are combined to provide depth and produce a sense of three-dimensionality." Termed a differentiable interpolation-based renderer, or DIB-R, the NVIDIA rendering framework has the potential to aid and accelerate various areas of 3D design and robotics, producing 3D models in a matter of seconds.
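The two-eye analogy above is the classic stereo-vision setup: a nearby point shifts more between the two views than a distant one, and that pixel shift (disparity) encodes depth. A minimal sketch of the standard pinhole-stereo relation Z = f·B/d (textbook geometry, not NVIDIA's DIB-R method; the focal length and baseline values below are made-up for illustration):

```python
# Toy illustration of how two horizontally offset views encode depth,
# using standard pinhole stereo geometry: depth Z = focal * baseline / disparity.
# (This is not NVIDIA's method -- just the geometric intuition behind the analogy.)

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Recover depth in metres from the pixel shift between two views."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Hypothetical camera: 800 px focal length, eyes 6.5 cm apart.
# A large shift between views means the point is close; a small shift, far.
near = depth_from_disparity(focal_px=800, baseline_m=0.065, disparity_px=52)
far = depth_from_disparity(focal_px=800, baseline_m=0.065, disparity_px=4)
print(near, far)  # -> 1.0 13.0 (metres)
```

The point of DIB-R is that a learned model can recover this kind of 3D information from a *single* image, without a second viewpoint to triangulate from.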
Nvidia researchers have published a paper describing a rendering framework that can produce 3D objects from 2D images. Thanks to machine learning, the technique also does a good job of predicting the correct shape, colour, texture, and lighting of the real-life objects it models. The research could have an important impact on machine vision with depth perception, for robotics, self-driving cars, and more. The full research paper, with the typically dry title Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer, is available as a PDF. Its main topic is a new rendering framework called DIB-R, a differentiable interpolation-based renderer.
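The key word in "differentiable renderer" is differentiable: because every rendered pixel is a smooth function of the shape parameters (DIB-R achieves this by interpolation-based soft rasterization), the error between a rendered image and a target image can be pushed back through the renderer as gradients to improve the 3D prediction. A heavily simplified caricature, assuming a 1D "image" of a circle's silhouette where pixel coverage falls off as a sigmoid at the edge (the constants and setup are invented for illustration; this mimics the idea only in spirit):

```python
import math

# Caricature of differentiable rendering: pixel i gets soft coverage
# sigmoid(K * (radius - |x_i|)), so the rendered silhouette is a smooth
# function of the shape parameter `radius` and we can fit it by gradient
# descent on an image loss. (Illustrative only -- not DIB-R's actual method.)

K = 4.0                                   # hypothetical edge-softness constant
XS = [i / 10.0 for i in range(-20, 21)]   # 1D pixel coordinates

def render(radius):
    return [1.0 / (1.0 + math.exp(-K * (radius - abs(x)))) for x in XS]

def step(radius, target, lr=0.05):
    """One gradient-descent step on the L2 image loss, using the
    analytic derivative of the sigmoid coverage w.r.t. the radius."""
    pred = render(radius)
    grad = sum(2.0 * (p - t) * K * p * (1.0 - p) for p, t in zip(pred, target))
    return radius - lr * grad

target = render(1.3)   # silhouette of the "true" object
r = 0.5                # initial guess
for _ in range(300):
    r = step(r, target)
print(round(r, 2))     # recovers a radius close to 1.3
```

Hard rasterization has zero gradient almost everywhere (a pixel is either covered or not), which is why the soft, interpolated formulation is what makes learning through a renderer possible.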
When the first instant photo was taken 75 years ago with a Polaroid camera, it was groundbreaking to rapidly capture the 3D world in a realistic 2D image. Today, AI researchers are working on the opposite: turning a collection of still images into a digital 3D scene in a matter of seconds. Known as inverse rendering, the process uses AI to approximate how light behaves in the real world, enabling researchers to reconstruct a 3D scene from a handful of 2D images taken at different angles. The NVIDIA Research team has developed an approach that accomplishes this task almost instantly -- making it one of the first models of its kind to combine ultra-fast neural network training and rapid rendering. NVIDIA applied this approach to a popular new technology called neural radiance fields, or NeRF.
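The inverse-rendering pattern described above -- observe a scene from several angles, then solve backwards for the 3D scene that explains all the views -- can be reduced to a toy that fits in a few lines. A minimal sketch, assuming each "image" is just the orthographic 1D projection of a single 2D point seen from a known angle (this is emphatically not NeRF, which learns a full volumetric radiance field with a neural network; only the observe-then-invert structure is shared):

```python
import math

# Toy inverse rendering: each view at angle `a` observes the projection
# u = x*cos(a) + y*sin(a) of a single 2D point. Given several views, we
# invert the projections by least squares to recover the "scene" (x, y).
# (Not NeRF itself -- just the observe-at-many-angles, solve-backwards idea.)

def project(point, angle):
    x, y = point
    return x * math.cos(angle) + y * math.sin(angle)

def reconstruct(angles, observations):
    """Normal-equations least squares for the 2-parameter scene."""
    a11 = sum(math.cos(a) ** 2 for a in angles)
    a12 = sum(math.cos(a) * math.sin(a) for a in angles)
    a22 = sum(math.sin(a) ** 2 for a in angles)
    b1 = sum(u * math.cos(a) for a, u in zip(angles, observations))
    b2 = sum(u * math.sin(a) for a, u in zip(angles, observations))
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

truth = (1.5, -0.7)                         # made-up ground-truth "scene"
angles = [0.0, 0.6, 1.2, 1.8, 2.4]          # a handful of viewing angles
views = [project(truth, a) for a in angles]
print(reconstruct(angles, views))           # ~ (1.5, -0.7)
```

In NeRF the unknown is not two numbers but a neural network mapping 3D position and view direction to color and density, and the "projection" is volumetric ray marching -- which is why fast training, NVIDIA's contribution here, matters so much.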