AITopics | Ozay, Mete

Plotting

Ozay, Mete

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Training CNNs With Normalized Kernels

Ozay, Mete (Tohoku University) | Okatani, Takayuki (Tohoku University and RIKEN Center for AIP)

AAAI ConferencesFeb-8-2018

Several methods of normalizing convolution kernels have been proposed in the literature to train convolutional neural networks (CNNs), and have shown some success. However, our understanding of these methods has lagged behind their success in application; there are a lot of open questions, such as why a certain type of kernel normalization is effective and what type of normalization should be employed for each (e.g., higher or lower) layer of a CNN. As the first step towards answering these questions, we propose a framework that enables us to use a variety of kernel normalization methods at any layer of a CNN. A naive integration of kernel normalization with a general optimization method, such as SGD, often entails instability while updating parameters. Thus, existing methods employ ad-hoc procedures to empirically assure convergence. In this study, we pose estimation of convolution kernels under normalization constraints as constraint-free optimization on kernel submanifolds that are identified by the employed constraints. Note that naive application of the established optimization methods for matrix manifolds to the aforementioned problems is not feasible because of the hierarchical nature of CNNs. To this end, we propose an algorithm for optimization on kernel manifolds in CNNs by appropriate scaling of the space of kernels based on structure of CNNs and statistics of data. We theoretically prove that the proposed algorithm has assurance of almost sure convergence to a solution at single minimum. Our experimental results show that the proposed method can successfully train popular CNN models using several different types of kernel normalization methods. Moreover, they show that the proposed method improves classification performance of baseline CNNs, and provides state-of-the-art performance for major image classification benchmarks.

deep learning, kernel, neural network, (21 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > Japan > Honshū > Tōhoku (0.14)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

Information Potential Auto-Encoders

Zhang, Yan, Ozay, Mete, Sun, Zhun, Okatani, Takayuki

arXiv.org Machine LearningAug-6-2017

In this paper, we suggest a framework to make use of mutual information as a regularization criterion to train Auto-Encoders (AEs). In the proposed framework, AEs are regularized by minimization of the mutual information between input and encoding variables of AEs during the training phase. In order to estimate the entropy of the encoding variables and the mutual information, we propose a non-parametric method. We also give an information theoretic view of Variational AEs (VAEs), which suggests that VAEs can be considered as parametric methods that estimate entropy. Experimental results show that the proposed non-parametric models have more degree of freedom in terms of representation learning of features drawn from complex distributions such as Mixture of Gaussians, compared to methods which estimate entropy using parametric approaches, such as Variational AEs.

artificial intelligence, mutual information, neural network, (16 more...)

arXiv.org Machine Learning

1706.04635

Country: Asia > Japan > Honshū > Tōhoku (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Linear Discriminant Generative Adversarial Networks

Sun, Zhun, Ozay, Mete, Okatani, Takayuki

arXiv.org Machine LearningJul-25-2017

We develop a novel method for training of GANs for unsupervised and class conditional generation of images, called Linear Discriminant GAN (LD-GAN). The discriminator of an LD-GAN is trained to maximize the linear separability between distributions of hidden representations of generated and targeted samples, while the generator is updated based on the decision hyper-planes computed by performing LDA over the hidden representations. LD-GAN provides a concrete metric of separation capacity for the discriminator, and we experimentally show that it is possible to stabilize the training of LD-GAN simply by calibrating the update frequencies between generators and discriminators in the unsupervised case, without employment of normalization methods and constraints on weights. In the class conditional generation tasks, the proposed method shows improved training stability together with better generalization performance compared to WGAN that employs an auxiliary classifier.

artificial intelligence, discriminator, neural network, (16 more...)

arXiv.org Machine Learning

1707.07831

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Discriminant Analysis (0.62)

Add feedback

Mesh Learning for Classifying Cognitive Processes

Ozay, Mete, Öztekin, Ilke, Öztekin, Uygar, Vural, Fatos T. Yarman

arXiv.org Artificial IntelligenceFeb-5-2015

A relatively recent advance in cognitive neuroscience has been multi-voxel pattern analysis (MVPA), which enables researchers to decode brain states and/or the type of information represented in the brain during a cognitive operation. MVPA methods utilize machine learning algorithms to distinguish among types of information or cognitive states represented in the brain, based on distributed patterns of neural activity. In the current investigation, we propose a new approach for representation of neural data for pattern analysis, namely a Mesh Learning Model. In this approach, at each time instant, a star mesh is formed around each voxel, such that the voxel corresponding to the center node is surrounded by its p-nearest neighbors. The arc weights of each mesh are estimated from the voxel intensity values by least squares method. The estimated arc weights of all the meshes, called Mesh Arc Descriptors (MADs), are then used to train a classifier, such as Neural Networks, k-Nearest Neighbor, Na\"ive Bayes and Support Vector Machines. The proposed Mesh Model was tested on neuroimaging data acquired via functional magnetic resonance imaging (fMRI) during a recognition memory experiment using categorized word lists, employing a previously established experimental paradigm (\"Oztekin & Badre, 2011). Results suggest that the proposed Mesh Learning approach can provide an effective algorithm for pattern analysis of brain activity during cognitive processing.

classifier, neural network, neurology, (20 more...)

arXiv.org Artificial Intelligence

1205.2382

Country:

North America > United States (0.68)
Asia > Middle East > Republic of Türkiye (0.28)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.56)

Add feedback

Discriminative Functional Connectivity Measures for Brain Decoding

Firat, Orhan, Ozay, Mete, Oztekin, Ilke, Vural, Fatos T. Yarman

arXiv.org Artificial IntelligenceMar-6-2014

We propose a statistical learning model for classifying cognitive processes based on distributed patterns of neural activation in the brain, acquired via functional magnetic resonance imaging (fMRI). In the proposed learning method, local meshes are formed around each voxel. The distance between voxels in the mesh is determined by using a functional neighbourhood concept. In order to define the functional neighbourhood, the similarities between the time series recorded for voxels are measured and functional connectivity matrices are constructed. Then, the local mesh for each voxel is formed by including the functionally closest neighbouring voxels in the mesh. The relationship between the voxels within a mesh is estimated by using a linear regression model. These relationship vectors, called Functional Connectivity aware Local Relational Features (FC-LRF) are then used to train a statistical learning machine. The proposed method was tested on a recognition memory experiment, including data pertaining to encoding and retrieval of words belonging to ten different semantic categories. Two popular classifiers, namely k-nearest neighbour (k-nn) and Support Vector Machine (SVM), are trained in order to predict the semantic category of the item being retrieved, based on activation patterns during encoding. The classification performance of the Functional Mesh Learning model, which range in 62%-71% is superior to the classical multi-voxel pattern analysis (MVPA) methods, which range in 40%-48%, for ten semantic categories.

health & medicine, neurology, voxel, (16 more...)

arXiv.org Artificial Intelligence

1402.5684

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)

Add feedback