Breakthrough AI Technique Enables Real-Time Rendering of Scenes in 3D From 2D Images

#artificialintelligence

To represent a 3D scene from a 2D image, a light field network encodes the 360-degree light field of the 3D scene into a neural network that directly maps each camera ray to the color observed by that ray. The new machine-learning system can generate a 3D scene from an image about 15,000 times faster than other methods. Humans are pretty good at looking at a single two-dimensional image and understanding the full three-dimensional scene that it captures. Artificial intelligence agents are not. Yet a machine that needs to interact with objects in the world -- like a robot designed to harvest crops or assist with surgery -- must be able to infer properties about a 3D scene from observations of the 2D images it's trained on.
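
The core idea is easiest to see in code. Below is a minimal sketch, not the published implementation, of a network that maps a ray's origin and direction straight to an RGB color, so rendering a pixel costs a single forward pass; the tiny MLP and the random rays are illustrative placeholders.

```python
# Minimal illustrative sketch (not the authors' code): a "light field network"
# as a small MLP that maps a camera ray directly to an RGB color, so rendering
# a pixel costs one network evaluation instead of many samples along the ray.
import torch
import torch.nn as nn

class TinyLightFieldNet(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        # Input: a ray described by its origin (3 values) and unit direction (3 values).
        self.mlp = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, origins, directions):
        rays = torch.cat([origins, directions], dim=-1)
        return self.mlp(rays)  # one evaluation per ray -> one pixel color

# Rendering an image is just a batched forward pass over all camera rays.
net = TinyLightFieldNet()
origins = torch.zeros(640 * 480, 3)            # placeholder pinhole camera at the origin
directions = torch.randn(640 * 480, 3)         # stand-in for real per-pixel ray directions
directions = directions / directions.norm(dim=-1, keepdim=True)
colors = net(origins, directions)              # shape: (307200, 3)
```

Because rendering reduces to one batched forward pass, the cost per frame is set by how many rays fit through the network at once, which is where the reported speedup over ray-marching style renderers comes from.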


Record-breaking camera keeps everything between 3 cm and 1.7 km in focus

#artificialintelligence

In photography, depth of field refers to how much of a three-dimensional scene the camera can keep in focus at once. A shallow depth of field, for example, would keep the subject sharp but blur out much of the foreground and background. Now, researchers at the National Institute of Standards and Technology have taken inspiration from ancient trilobites to demonstrate a new light field camera with the deepest depth of field ever recorded. These extinct arthropods had quite complex visual systems, including compound eyes featuring anywhere from tens to thousands of tiny independent units, each with its own cornea, lens and photoreceptor cells. One trilobite in particular, Dalmanitina socialis, captured the attention of the NIST researchers due to its unique compound eye structure.
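
For a sense of what a depth-of-field figure like "3 cm to 1.7 km" means, the standard thin-lens depth-of-field formulas can be evaluated directly; the focal length, f-number and circle of confusion below are illustrative placeholders, not the parameters of the NIST camera.

```python
def depth_of_field(f_mm, n, c_mm, subject_mm):
    """Near/far limits of acceptable focus (thin-lens model), in millimetres."""
    blur = n * c_mm * (subject_mm - f_mm)
    near = subject_mm * f_mm**2 / (f_mm**2 + blur)
    far = subject_mm * f_mm**2 / (f_mm**2 - blur) if blur < f_mm**2 else float("inf")
    return near, far

# Example: 50 mm lens at f/8, 0.03 mm circle of confusion, focused at 3 m.
near, far = depth_of_field(f_mm=50, n=8, c_mm=0.03, subject_mm=3000)
print(f"in focus from {near / 1000:.2f} m to {far / 1000:.2f} m")   # roughly 2.3 m to 4.2 m
```

A conventional camera with these kinds of numbers keeps only a couple of metres in focus, which is what makes a centimetres-to-kilometres depth of field so unusual.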


Machine-learned, light-field camera detects 3D facial expressions – News Medical

#artificialintelligence

The facial expressions in the acquired 3D images were distinguished through machine learning with an average of 85% accuracy – a statistically …


Technique enables real-time rendering of scenes in 3D

#artificialintelligence

Humans are pretty good at looking at a single two-dimensional image and understanding the full three-dimensional scene that it captures. Artificial intelligence agents are not. Yet a machine that needs to interact with objects in the world--like a robot designed to harvest crops or assist with surgery--must be able to infer properties about a 3D scene from observations of the 2D images it's trained on. While scientists have had success using neural networks to infer representations of 3D scenes from images, these machine learning methods aren't fast enough to make them feasible for many real-world applications. A new technique demonstrated by researchers at MIT and elsewhere is able to represent 3D scenes from images about 15,000 times faster than some existing models.


Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations

arXiv.org Artificial Intelligence

A classical problem in computer vision is to infer a 3D scene representation from few images that can be used to render novel views at interactive rates. Previous work focuses on reconstructing pre-defined 3D representations, e.g. textured meshes, or implicit representations, e.g. radiance fields, and often requires input images with precise camera poses and long processing times for each novel scene. In this work, we propose the Scene Representation Transformer (SRT), a method which processes posed or unposed RGB images of a new area, infers a "set-latent scene representation", and synthesises novel views, all in a single feed-forward pass. To calculate the scene representation, we propose a generalization of the Vision Transformer to sets of images, enabling global information integration, and hence 3D reasoning. An efficient decoder transformer parameterizes the light field by attending into the scene representation to render novel views. Learning is supervised end-to-end by minimizing a novel-view reconstruction error. We show that this method outperforms recent baselines in terms of PSNR and speed on synthetic datasets, including a new dataset created for the paper. Further, we demonstrate that SRT scales to support interactive visualization and semantic segmentation of real-world outdoor environments using Street View imagery.
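
A rough sketch of that pipeline, assuming a patch-token encoder and a single cross-attention decoder (the class names and sizes here are invented for illustration, and the real SRT differs in many details):

```python
# Illustrative sketch of the SRT idea (not the official implementation):
# a transformer encoder turns patch tokens from several input views into a
# "set-latent scene representation"; a small decoder attends into that set
# with a query-ray embedding and predicts the pixel color in one feed-forward pass.
import torch
import torch.nn as nn

class ToySRT(nn.Module):
    def __init__(self, d=128, patch=16, heads=4, enc_layers=4):
        super().__init__()
        self.patchify = nn.Conv2d(3, d, kernel_size=patch, stride=patch)
        enc_layer = nn.TransformerEncoderLayer(d_model=d, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=enc_layers)
        self.ray_embed = nn.Linear(6, d)                  # ray origin + direction -> query token
        self.cross_attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.to_rgb = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, 3))

    def forward(self, images, rays):
        # images: (B, V, 3, H, W) input views; rays: (B, R, 6) query rays for a novel view.
        B, V, C, H, W = images.shape
        tokens = self.patchify(images.reshape(B * V, C, H, W))   # (B*V, d, H/p, W/p)
        tokens = tokens.flatten(2).transpose(1, 2)               # (B*V, patches, d)
        tokens = tokens.reshape(B, -1, tokens.shape[-1])         # pool tokens from all views
        scene = self.encoder(tokens)                             # set-latent scene representation
        queries = self.ray_embed(rays)                           # (B, R, d)
        attended, _ = self.cross_attn(queries, scene, scene)     # decode by attending into the set
        return self.to_rgb(attended)                             # (B, R, 3) predicted colors

model = ToySRT()
views = torch.rand(1, 3, 3, 64, 64)        # three unposed RGB views of a toy scene
query_rays = torch.rand(1, 1024, 6)        # rays of the novel view to synthesise
colors = model(views, query_rays)          # trained end-to-end with a novel-view reconstruction loss
```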


Lensless multicore-fiber microendoscope for real-time tailored light field generation with phase encoder neural network (CoreNet)

arXiv.org Artificial Intelligence

The generation of tailored light with multi-core fiber (MCF) lensless microendoscopes is widely used in biomedicine. However, the computer-generated holograms (CGHs) used for such applications are typically produced by iterative algorithms that demand heavy computation, limiting advanced applications like in vivo optogenetic stimulation and fiber-optic cell manipulation. The random and discrete distribution of the fiber cores induces strong spatial aliasing in the CGHs, so an approach that can rapidly generate tailored CGHs for MCFs is in high demand. We demonstrate a novel phase-encoder deep neural network (CoreNet) that can generate accurate tailored CGHs for MCFs at near video rate. Simulations show that CoreNet speeds up computation by two orders of magnitude and increases the fidelity of the generated light field compared with conventional CGH techniques. For the first time, tailored CGHs generated in real time are loaded on the fly onto the phase-only SLM for dynamic light field generation through the MCF microendoscope in experiments. This paves the way for real-time cell rotation and several further applications that require real-time, high-fidelity light delivery in biomedicine.
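
A heavily simplified sketch of the phase-encoder idea, assuming a plain CNN and a toy Fourier-transform propagation model standing in for the real multi-core fiber optics (all names and sizes here are illustrative, not the published CoreNet):

```python
# Simplified sketch: a CNN maps a target intensity pattern to a phase-only hologram,
# and training minimizes the error between the target and the intensity produced by
# a differentiable (here: plain far-field Fourier) propagation model.
import torch
import torch.nn as nn

class PhaseEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, target_intensity):
        # Predict a phase map in [-pi, pi] for a phase-only SLM.
        return torch.pi * torch.tanh(self.net(target_intensity))

def propagate(phase):
    # Toy stand-in for light propagation: far-field intensity of a unit-amplitude,
    # phase-only field. The real model would include the MCF core layout and diffraction.
    field = torch.exp(1j * phase)
    far_field = torch.fft.fftshift(torch.fft.fft2(field), dim=(-2, -1))
    return far_field.abs() ** 2

encoder = PhaseEncoder()
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
target = torch.rand(8, 1, 64, 64)                  # batch of desired intensity patterns
for _ in range(10):                                # a few illustrative training steps
    phase = encoder(target)
    recon = propagate(phase)
    recon = recon / (recon.amax(dim=(-2, -1), keepdim=True) + 1e-8)
    loss = nn.functional.mse_loss(recon, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The speed advantage over iterative hologram optimization comes from amortization: once such a network is trained, each new target pattern needs only one forward pass instead of a fresh optimization loop.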


A Convolutional Neural Network Approach to the Classification of Engineering Models

arXiv.org Artificial Intelligence

This paper presents a deep learning approach to the classification of Engineering (CAD) models using Convolutional Neural Networks (CNNs). Owing to the availability of large annotated datasets and sufficient computational power in the form of GPUs, many deep learning-based solutions for object classification have been proposed of late, especially in the domain of images and graphical models. Nevertheless, very few solutions have been proposed for the task of functional classification of CAD models. Hence, for this research, CAD models were collected from the Engineering Shape Benchmark (ESB) and the National Design Repository (NDR) and augmented with newer models created using modelling software to form a dataset, 'CADNET'. A residual network architecture, inspired by the popular ResNet, is proposed for CADNET. A weighted Light Field Descriptor (LFD) scheme is chosen as the method of feature extraction, and the generated images are fed as inputs to the CNN. The problem of class imbalance in the dataset is addressed using a class-weights approach. Experiments were also conducted on CADNET with other signatures, such as the geodesic distance, and with other network architectures. The LFD-based CNN approach using the proposed network architecture, along with gradient boosting, yielded the best classification accuracy on CADNET.
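
As an illustration of the class-weights idea (the exact weighting rule, input rendering and network in the paper may differ), rarer classes can be given inverse-frequency weights in the cross-entropy loss while a ResNet-style classifier consumes the LFD view images:

```python
# Sketch of class-weighted training for an imbalanced CAD-model dataset.
# The label list, weighting rule and stand-in resnet18 are illustrative only.
import torch
import torch.nn as nn
from collections import Counter
from torchvision.models import resnet18

labels = [0, 0, 0, 0, 1, 1, 2]                        # toy, highly imbalanced label list
counts = Counter(labels)
num_classes = len(counts)
total = len(labels)
# Inverse-frequency weights, normalized so a perfectly balanced dataset gives weight 1.
weights = torch.tensor([total / (num_classes * counts[c]) for c in range(num_classes)],
                       dtype=torch.float32)
criterion = nn.CrossEntropyLoss(weight=weights)       # rare classes contribute more to the loss

# A ResNet-style classifier over single-channel LFD view renderings (stand-in model).
model = resnet18(num_classes=num_classes)
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)

views = torch.rand(4, 1, 224, 224)                    # placeholder LFD renderings
targets = torch.tensor([0, 1, 2, 0])
loss = criterion(model(views), targets)
loss.backward()
```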


Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering

arXiv.org Artificial Intelligence

Inferring representations of 3D scenes from 2D observations is a fundamental problem of computer graphics, computer vision, and artificial intelligence. Emerging 3D-structured neural scene representations are a promising approach to 3D scene understanding. In this work, we propose a novel neural scene representation, Light Field Networks or LFNs, which represent both geometry and appearance of the underlying 3D scene in a 360-degree, four-dimensional light field parameterized via a neural implicit representation. Rendering a ray from an LFN requires only a *single* network evaluation, as opposed to hundreds of evaluations per ray for ray-marching or volumetric based renderers in 3D-structured neural scene representations. In the setting of simple scenes, we leverage meta-learning to learn a prior over LFNs that enables multi-view consistent light field reconstruction from as little as a single image observation. This results in dramatic reductions in time and memory complexity, and enables real-time rendering. The cost of storing a 360-degree light field via an LFN is two orders of magnitude lower than conventional methods such as the Lumigraph. Utilizing the analytical differentiability of neural implicit representations and a novel parameterization of light space, we further demonstrate the extraction of sparse depth maps from LFNs.
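
One natural reading of the "novel parameterization of light space" is a Plücker-style ray coordinate, which assigns every point along a ray the same 6D code, so a single network evaluation per ray is well defined. The snippet below (an illustrative assumption, not the authors' code) shows that invariance; such coordinates would feed an MLP like the one sketched earlier.

```python
# Plücker-style ray parameterization: a ray is described by its unit direction d
# and moment m = o x d, which is identical for every point o on the ray.
import torch

def plucker(origins, directions):
    d = directions / directions.norm(dim=-1, keepdim=True)
    m = torch.cross(origins, d, dim=-1)      # moment is invariant to sliding o along d
    return torch.cat([d, m], dim=-1)         # 6D coordinates, one per ray

o = torch.tensor([[0.0, 0.0, 0.0]])
d = torch.tensor([[0.0, 0.0, 1.0]])
other_point_on_same_ray = o + 2.5 * d
print(torch.allclose(plucker(o, d), plucker(other_point_on_same_ray, d)))   # True
```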


Microscopes Improved With Artificial Intelligence

#artificialintelligence

To observe the swift neuronal signals in a fish brain, scientists have started to use a technique called light-field microscopy, which makes it possible to image such fast biological processes in 3D. But the images are often lacking in quality, and it takes hours or days for massive amounts of data to be converted into 3D volumes and movies. Now, EMBL scientists have combined artificial intelligence (AI) algorithms with two cutting-edge microscopy techniques - an advance that shortens the time for image processing from days to mere seconds, while ensuring that the resulting images are crisp and accurate. The findings are published in Nature Methods. "Ultimately, we were able to take 'the best of both worlds' in this approach," says Nils Wagner, one of the paper's two lead authors and now a PhD student at the Technical University of Munich.