apparent motion
BuFF: Burst Feature Finder for Light-Constrained 3D Reconstruction
Ravendran, Ahalya, Bryson, Mitch, Dansereau, Donald G.
Robots operating at night using conventional vision cameras face significant challenges in reconstruction due to noise-limited images. Previous work has demonstrated that burst-imaging techniques can be used to partially overcome this issue. In this paper, we develop a novel feature detector that operates directly on image bursts that enhances vision-based reconstruction under extremely low-light conditions. Our approach finds keypoints with well-defined scale and apparent motion within each burst by jointly searching in a multi-scale and multi-motion space. Because we describe these features at a stage where the images have higher signal-to-noise ratio, the detected features are more accurate than the state-of-the-art on conventional noisy images and burst-merged images and exhibit high precision, recall, and matching performance. We show improved feature performance and camera pose estimates and demonstrate improved structure-from-motion performance using our feature detector in challenging light-constrained scenes. Our feature finder provides a significant step towards robots operating in low-light scenarios and applications including night-time operations.
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (0.94)
- Information Technology > Artificial Intelligence > Robots (0.91)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.74)
Superevents: Towards Native Semantic Segmentation for Event-based Cameras
Low, Weng Fei, Sonthalia, Ankit, Gao, Zhi, van Schaik, André, Ramesh, Bharath
Most successful computer vision models transform low-level features, such as Gabor filter responses, into richer representations of intermediate or mid-level complexity for downstream visual tasks. These mid-level representations have not been explored for event cameras, although it is especially relevant to the visually sparse and often disjoint spatial information in the event stream. By making use of locally consistent intermediate representations, termed as superevents, numerous visual tasks ranging from semantic segmentation, visual tracking, depth estimation shall benefit. In essence, superevents are perceptually consistent local units that delineate parts of an object in a scene. Inspired by recent deep learning architectures, we present a novel method that employs lifetime augmentation for obtaining an event stream representation that is fed to a fully convolutional network to extract superevents. Our qualitative and quantitative experimental results on several sequences of a benchmark dataset highlights the significant potential for event-based downstream applications.
- Asia (0.28)
- North America > United States (0.15)
Optimality and limitations of audio-visual integration for cognitive systems
Boyce, W. Paul, Lindsay, Tony, Zgonnikov, Arkady, Rano, Ignacio, Wong-Lin, KongFatt
Multimodal integration is an important process in perceptual decision-making. In humans, this process has often been shown to be statistically optimal, or near optimal: sensory information is combined in a fashion that minimises the average error in perceptual representation of stimuli. However, sometimes there are costs that come with the optimization, manifesting as illusory percepts. We review audio-visual facilitations and illusions that are products of multisensory integration, and the computational models that account for these phenomena. In particular, the same optimal computational model can lead to illusory percepts, and we suggest that more studies should be needed to detect and mitigate these illusions, as artefacts in artificial cognitive systems. We provide cautionary considerations when designing artificial cognitive systems with the view of avoiding such artefacts. Finally, we suggest avenues of research towards solutions to potential pitfalls in system design. We conclude that detailed understanding of multisensory integration and the mechanisms behind audio-visual illusions can benefit the design of artificial cognitive systems.
- Europe > Netherlands > South Holland > Delft (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Europe > United Kingdom > Northern Ireland > County Londonderry > Londonderry (0.04)
- Asia > Japan (0.04)
- Overview (1.00)
- Research Report > New Finding (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- (2 more...)
Neural Analog Diffusion-Enhancement Layer and Spatio-Temporal Grouping in Early Vision
Waxman, Allen M., Seibert, Michael, Cunningham, Robert K., Wu, Jian
A new class of neural network aimed at early visual processing is described; we call it a Neural Analog Diffusion-Enhancement Layer or "NADEL." The network consists of two levels which are coupled through feedfoward and shunted feedback connections. The lower level is a two-dimensional diffusion map which accepts visual features as input, and spreads activity over larger scales as a function of time. The upper layer is periodically fed the activity from the diffusion layer and locates local maxima in it (an extreme form of contrast enhancement) using a network of local comparators. These local maxima are fed back to the diffusion layer using an on-center/off-surround shunting anatomy. The maxima are also available as output of the network. The network dynamics serves to cluster features on multiple scales as a function of time, and can be used in a variety of early visual processing tasks such as: extraction of comers and high curvature points along edge contours, line end detection, gap filling in contours, generation of fixation points, perceptual grouping on multiple scales, correspondence and path impletion in long-range apparent motion, and building 2-D shape representations that are invariant to location, orientation, scale, and small deformation on the visual field.
- North America > United States > New York (0.05)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (4 more...)
Neural Analog Diffusion-Enhancement Layer and Spatio-Temporal Grouping in Early Vision
Waxman, Allen M., Seibert, Michael, Cunningham, Robert K., Wu, Jian
A new class of neural network aimed at early visual processing is described; we call it a Neural Analog Diffusion-Enhancement Layer or "NADEL." The network consists of two levels which are coupled through feedfoward and shunted feedback connections. The lower level is a two-dimensional diffusion map which accepts visual features as input, and spreads activity over larger scales as a function of time. The upper layer is periodically fed the activity from the diffusion layer and locates local maxima in it (an extreme form of contrast enhancement) using a network of local comparators. These local maxima are fed back to the diffusion layer using an on-center/off-surround shunting anatomy. The maxima are also available as output of the network. The network dynamics serves to cluster features on multiple scales as a function of time, and can be used in a variety of early visual processing tasks such as: extraction of comers and high curvature points along edge contours, line end detection, gap filling in contours, generation of fixation points, perceptual grouping on multiple scales, correspondence and path impletion in long-range apparent motion, and building 2-D shape representations that are invariant to location, orientation, scale, and small deformation on the visual field.
- North America > United States > New York (0.05)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (4 more...)
Neural Analog Diffusion-Enhancement Layer and Spatio-Temporal Grouping in Early Vision
Waxman, Allen M., Seibert, Michael, Cunningham, Robert K., Wu, Jian
A new class of neural network aimed at early visual processing is described; we call it a Neural Analog Diffusion-Enhancement Layer or "NADEL." The network consists of two levels which are coupled through feedfoward and shunted feedback connections. The lower level is a two-dimensional diffusion map which accepts visual features as input, and spreads activity over larger scales as a function of time. The upper layer is periodically fed the activity from the diffusion layer and locates local maxima in it (an extreme form of contrast enhancement) using a network of local comparators. These local maxima are fed back to the diffusion layer using an on-center/off-surround shunting anatomy. The maxima are also available as output of the network. The network dynamics serves to cluster features on multiple scales as a function of time, and can be used in a variety of early visual processing tasks such as: extraction of comers and high curvature points along edge contours, line end detection, gap filling in contours, generation of fixation points, perceptual grouping on multiple scales, correspondence and path impletion in long-range apparent motion, and building 2-D shape representations that are invariant to location, orientation, scale, and small deformation on the visual field.
- North America > United States > New York (0.05)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (4 more...)