 hypercolumn


Enhancing CNNs robustness to occlusions with bioinspired filters for border completion

Coutinho, Catarina P., Merhab, Aneeqa, Petkovic, Janko, Zanchetta, Ferdinando, Fioresi, Rita

arXiv.org Artificial Intelligence

We exploit the mathematical modeling of the visual cortex mechanism for border completion to define custom filters for CNNs. We see a consistent improvement in performance, particularly in accuracy, when our modified LeNet 5 is tested with occluded MNIST images. Keywords: Convolutional Neural Networks, Visual Cortex. 1 Introduction. Visual perception has evolved as a fundamental tool for living organisms to extract information from their surroundings and adapt their behavior. However, encoding visual information presents several challenges. One major issue is occlusion, i.e., an object's outline being partially hidden by an obstacle.


MARTI-4: new model of human brain, considering neocortex and basal ganglia -- learns to play Atari game by reinforcement learning on a single CPU

Pivovarov, Igor, Shumsky, Sergey

arXiv.org Artificial Intelligence

We present Deep Control, a new ML architecture of cortico-striatal brain circuits, which uses a whole cortical column as a structural element instead of a single neuron. Based on this architecture, we present MARTI, a new model of the human brain, considering neocortex and basal ganglia. This model is designed to implement expedient behavior and is capable of learning and achieving goals in unknown environments. We introduce a novel surprise feeling mechanism that significantly improves the reinforcement learning process through inner rewards. We use the OpenAI Gym environment to demonstrate MARTI learning on a single CPU in just several hours.


Multi-layer Representation Learning for Robust OOD Image Classification

Ballas, Aristotelis, Diou, Christos

arXiv.org Artificial Intelligence

Convolutional Neural Networks have become the norm in image classification. Nevertheless, their difficulty in maintaining high accuracy across datasets has become apparent in the past few years. In order to utilize such models in real-world scenarios and applications, they must be able to provide trustworthy predictions on unseen data. In this paper, we argue that extracting features from a CNN's intermediate layers can assist in the model's final prediction. Specifically, we adapt the Hypercolumns method to a ResNet-18 and find a significant increase in the model's accuracy when evaluating on the NICO dataset.
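
The idea of building a per-pixel descriptor from intermediate layers can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the feature maps are random placeholders, the channel counts and spatial sizes are only assumed to resemble early ResNet-18 stages, and nearest-neighbour upsampling is a simplification chosen for brevity. The helper names `upsample_nearest` and `hypercolumns` are hypothetical.

```python
import numpy as np

def upsample_nearest(fmap, out_h, out_w):
    """Resize a (C, H, W) feature map to (C, out_h, out_w) by nearest neighbour."""
    _, h, w = fmap.shape
    # Map every output row/column back to its source row/column.
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return fmap[:, rows[:, None], cols[None, :]]

def hypercolumns(feature_maps, out_h, out_w):
    """Stack feature maps from several layers along the channel axis.

    Each spatial location of the result holds the concatenated activations
    of all layers at that location (its "hypercolumn").
    """
    resized = [upsample_nearest(f, out_h, out_w) for f in feature_maps]
    return np.concatenate(resized, axis=0)

# Placeholder activations standing in for three intermediate layers (assumed shapes).
rng = np.random.default_rng(0)
maps = [
    rng.standard_normal((64, 56, 56)),
    rng.standard_normal((128, 28, 28)),
    rng.standard_normal((256, 14, 14)),
]

hc = hypercolumns(maps, 56, 56)
print(hc.shape)  # (448, 56, 56)
```

In a real network the stacked tensor would typically be pooled or passed to a classifier head; how the paper does this for ResNet-18 is not stated in the abstract.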


Improving speech emotion recognition via Transformer-based Predictive Coding through transfer learning

Lian, Zheng, Li, Ya, Tao, Jianhua, Huang, Jian

arXiv.org Machine Learning

Speech emotion recognition is an important aspect of human-computer interaction. Prior works propose various transfer learning approaches to deal with limited samples in speech emotion recognition. However, they require labeled data for the source task, which takes much effort to collect. To solve this problem, we focus on an unsupervised task, predictive coding, for which nearly unlimited data can be utilized in most domains. In this paper, we utilize a multi-layer Transformer model for predictive coding, followed by transfer learning approaches to share knowledge of the pre-trained predictive model for speech emotion recognition. We conduct experiments on IEMOCAP, and experimental results reveal the advantages of the proposed method. Our method reaches 65.03% in weighted accuracy, which also outperforms some currently advanced approaches.




An Information-Theoretic Framework for Understanding Saccadic Eye Movements

Lee, Tai Sing, Yu, Stella X.

Neural Information Processing Systems

In this paper, we propose that information maximization can provide a unified framework for understanding saccadic eye movements. In this framework, the mutual information among the cortical representations of the retinal image, the priors constructed from our long term visual experience, and a dynamic short-term internal representation constructed from recent saccades provides a map for guiding eye navigation. By directing the eyes to locations of maximum complexity in neuronal ensemble responses at each step, the automatic saccadic eye movement system greedily collects information about the external world, while modifying the neural representations in the process. This framework attempts to connect several psychological phenomena, such as pop-out and inhibition of return, to long term visual experience and short term working memory. It also provides an interesting perspective on contextual computation and formation of neural representation in the visual system. 1 Introduction. When we look at a painting or a visual scene, our eyes move around rapidly and constantly to look at different parts of the scene.


Spatial Decorrelation in Orientation Tuned Cortical Cells

Dimitrov, Alexander, Cowan, Jack D.

Neural Information Processing Systems

In this paper we propose a model for the lateral connectivity of orientation-selective cells in the visual cortex based on information-theoretic considerations. We study the properties of the input signal to the visual cortex and find new statistical structures which have not been processed in the retino-geniculate pathway. Applying the idea that the system optimizes the representation of incoming signals, we derive the lateral connectivity that will achieve this for a set of local orientation-selective patches, as well as the complete spatial structure of a layer of such patches. We compare the results with various physiological measurements.


Visual Cortex Circuitry and Orientation Tuning

Mundel, Trevor, Dimitrov, Alexander, Cowan, Jack D.

Neural Information Processing Systems

A simple mathematical model for the large-scale circuitry of primary visual cortex is introduced. It is shown that a basic cortical architecture of recurrent local excitation and lateral inhibition can account quantitatively for such properties as orientation tuning. The model can also account for such local effects as cross-orientation suppression. It is also shown that non-local state-dependent coupling between similar orientation patches, when added to the model, can satisfactorily reproduce such effects as non-local iso-orientation suppression and non-local cross-orientation enhancement. Following this, an account is given of perceptual phenomena involving object segmentation, such as "pop-out", and the direct and indirect tilt illusions.

