 illusory contour


A biological vision inspired framework for machine perception of abutting grating illusory contours

arXiv.org Artificial Intelligence

Higher levels of machine intelligence demand alignment with human perception and cognition. Deep neural networks (DNNs), which dominate machine intelligence, have demonstrated exceptional performance across various real-world tasks. Nevertheless, recent evidence suggests that DNNs fail to perceive illusory contours such as the abutting grating, a discrepancy with human perception patterns. Departing from previous works, we propose a novel deep network called the illusory contour perception network (ICPNet), inspired by the circuits of the visual cortex. In ICPNet, a multi-scale feature projection (MFP) module is designed to extract multi-scale representations. To boost the interaction between feedforward and feedback features, a feature interaction attention module (FIAM) is introduced. Moreover, drawing inspiration from the shape bias observed in human perception, an edge detection task conducted via the edge fusion module (EFM) injects shape constraints that guide the network to concentrate on the foreground. We assess our method on the existing AG-MNIST test set and the AG-Fashion-MNIST test sets constructed in this work. Comprehensive experimental results reveal that ICPNet is significantly more sensitive to abutting grating illusory contours than state-of-the-art models, with notable improvements in top-1 accuracy across various subsets. This work is expected to be a step towards human-level intelligence for DNN-based models.
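
For readers unfamiliar with the stimulus, the following is a minimal sketch of how an abutting grating image can be rendered from a binary shape mask. It is an illustration only and not the procedure used to build AG-MNIST or AG-Fashion-MNIST. The function name `abutting_grating` and the parameters `period` and `shift` are hypothetical. The idea is that horizontal line gratings are drawn everywhere, but with a phase offset inside the shape, so the outline is implied only by where the lines end.

```python
import numpy as np

def abutting_grating(mask, period=4, shift=None):
    """Draw horizontal gratings whose phase is offset inside `mask`.

    Hypothetical helper for illustration; `period` is the line spacing in
    pixels and `shift` is the vertical offset applied inside the shape.
    """
    mask = np.asarray(mask, dtype=bool)
    if shift is None:
        shift = period // 2  # half-period offset gives the strongest break
    rows = np.arange(mask.shape[0])[:, None]      # column of row indices
    outside = np.broadcast_to(rows % period == 0, mask.shape)
    inside = np.broadcast_to((rows - shift) % period == 0, mask.shape)
    # No pixel is drawn on the shape boundary itself; only line ends mark it.
    return np.where(mask, inside, outside).astype(np.uint8)

# Toy example: a 6x6 square inside a 12x12 canvas.
mask = np.zeros((12, 12), dtype=bool)
mask[3:9, 3:9] = True
image = abutting_grating(mask, period=4)
```

In this toy output, rows 0, 4 and 8 carry lines outside the square, while rows 6 carries a line inside it, so the square is visible to a human observer only through the misaligned line terminations.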


Abutting Grating Illusion: Cognitive Challenge to Neural Network Models

arXiv.org Artificial Intelligence

Deep learning has achieved tremendous success during the past decade, even surpassing human performance in numerous vision tasks [Russakovsky et al., 2015][Dodge and Karam, 2017]. However, it is certainly not a panacea for all vision tasks that humans can perform. While ANN models can achieve extremely high results on a test set drawn from the same distribution as the training set, they can easily fail when faced with out-of-distribution (OOD) data [Dodge and Karam, 2017]. It has been discovered that neural network performance decreases under different image corruptions, such as noise, blur, brightness change, fog, etc. [Dodge and Karam, 2016][Hendrycks and Dietterich, 2019]. On the other hand, humans are extremely robust to different sorts of distortions applied to images [Dodge and Karam, 2017]. An even more extreme case is adversarial attacks, where human-imperceptible perturbations can cause catastrophic failures in well-trained neural network models [Szegedy et al., 2013]. Multiple attack and defense mechanisms [Szegedy et al., 2013][Carlini and Wagner, 2017][Madry et al., 2017][Moosavi-Dezfooli et al., 2016][Papernot et al., 2016] have been proposed in recent years, but the problem remains unsolved. Moreover, it has been found that the errors made by humans and neural network models have little correlation with each other [Dodge and Karam, 2017], indicating that current machine visual systems might still have fundamental deficits compared to human visual systems.


First-Order Modeling and Stability Analysis of Illusory Contours

arXiv.org Artificial Intelligence

In System Theory [20], input-output analysis has been a major tool for partial or complete identification of black-box systems. In cognitive vision science, the study of various visual illusions follows exactly the same spirit. Cognitive scientists have designed numerous intriguing inputs of image signals, so that the distorted or transformed outputs (as reported by an average human observer) can help reveal some crucial latent properties of the human vision system (see, e.g., the remarkable works of Adelson [1], Knill and Kersten [14, 16], and Kanizsa [11]). Illusory contours are such a well known class of visual illusions, and the current paper develops a mathematical model to characterize, analyze, and simulate generic illusory contours. Our work has been closely inspired by many existent modeling works, especially by Sarti, Malladi, and Sethian [24], and Zhu and Chan [30, 31]. Figure 1 shows two examples of illusory contours known as Kanizsa triangle and square [11, 24, 30].


Logic and MRF Circuitry for Labeling Occluding and Thinline Visual Contours

Neural Information Processing Systems

This paper presents representation and logic for labeling contrast edges and ridges in visual scenes in terms of both surface occlusion (border ownership) and thinline objects. In natural scenes, thinline objects include sticks and wires, while in human graphical communication thinlines include connectors, dividers, and other abstract devices. Our analysis is directed at both natural and graphical domains. The basic problem is to formulate the logic of the interactions among local image events, specifically contrast edges, ridges, junctions, and alignment relations, such as to encode the natural constraints among these events in visual scenes. In a sparse heterogeneous Markov Random Field framework, we define a set of interpretation nodes and energy/potential functions among them. The minimum energy configuration found by Loopy Belief Propagation is shown to correspond to preferred human interpretation across a wide range of prototypical examples including important illusory contour figures such as the Kanizsa Triangle, as well as more difficult examples. In practical terms, the approach delivers correct interpretations of inherently ambiguous hand-drawn box-and-connector diagrams at low computational cost.


Unsupervised Pixel-prediction

Neural Information Processing Systems

When a sensory system constructs a model of the environment from its input, it might need to verify the model's accuracy. One method of verification is multivariate time-series prediction: a good model could predict the near-future activity of its inputs, much as a good scientific theory predicts future data. Such a predicting model would require copious top-down connections to compare the predictions with the input. That feedback could improve the model's performance in two ways: by biasing internal activity toward expected patterns, and by generating specific error signals if the predictions fail. A proof-of-concept model (an event-driven, computationally efficient layered network incorporating "cortical" features like all-excitatory synapses and local inhibition) was constructed to make near-future predictions of a simple, moving stimulus. After unsupervised learning, the network contained units not only tuned to obvious features of the stimulus like contour orientation and motion, but also to contour discontinuity ("end-stopping") and illusory contours.

