AITopics

1410.6271

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.28)

Genre: Research Report (0.82)

Industry: Transportation > Air (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Kobak, Dmitry, Brendel, Wieland, Constantinidis, Christos, Feierstein, Claudia E., Kepecs, Adam, Mainen, Zachary F., Romo, Ranulfo, Qi, Xue-Lian, Uchida, Naoshige, Machens, Christian K.

Demixed principal component analysis of population activity in higher cortical areas reveals independent representation of task parameters

Neurons in higher cortical areas, such as the prefrontal cortex, are known to be tuned to a variety of sensory and motor variables. The resulting diversity of neural tuning often obscures the represented information. Here we introduce a novel dimensionality reduction technique, demixed principal component analysis (dPCA), which automatically discovers and highlights the essential features in complex population activities. We reanalyze population data from the prefrontal areas of rats and monkeys performing a variety of working memory and decision-making tasks. In each case, dPCA summarizes the relevant features of the population response in a single figure. The population activity is decomposed into a few demixed components that capture most of the variance in the data and that highlight dynamic tuning of the population to various task parameters, such as stimuli, decisions, rewards, etc. Moreover, dPCA reveals strong, condition-independent components of the population activity that remain unnoticed with conventional approaches.

artificial intelligence, machine learning, neuron, (19 more...)

1410.6031

Country:

North America > United States (0.46)
Europe (0.45)

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.60)

Sabato, Sivan, Munos, Remi

Active Regression by Stratification

We propose a new active learning algorithm for parametric linear regression with random design. We provide finite sample convergence guarantees for general distributions in the misspecified model. This is the first active learner for this setting that provably can improve over passive learning. Unlike other learning settings (such as classification), in regression the passive learning rate of $O(1/\epsilon)$ cannot in general be improved upon. Nonetheless, the so-called `constant' in the rate of convergence, which is characterized by a distribution-dependent risk, can be improved in many cases. For a given distribution, achieving the optimal risk requires prior knowledge of the distribution. Following the stratification technique advocated in Monte-Carlo function integration, our active learner approaches the optimal risk using piecewise constant approximations.

artificial intelligence, machine learning, regression, (17 more...)

1410.592

Country: Asia > Middle East (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Alquier, Pierre, Cottet, Vincent, Chopin, Nicolas, Rousseau, Judith

Bayesian matrix completion: prior specification

Low-rank matrix estimation from incomplete measurements recently received increased attention due to the emergence of several challenging applications, such as recommender systems; see in particular the famous Netflix challenge. While the behaviour of algorithms based on nuclear norm minimization is now well understood, an as yet unexplored avenue of research is the behaviour of Bayesian algorithms in this context. In this paper, we briefly review the priors used in the Bayesian literature for matrix completion. A standard approach is to assign an inverse gamma prior to the singular values of a certain singular value decomposition of the matrix of interest; this prior is conjugate. However, we show that two other types of priors (again for the singular values) may be conjugate for this model: a gamma prior, and a discrete prior. Conjugacy is very convenient, as it makes it possible to implement either Gibbs sampling or Variational Bayes. Interestingly enough, the maximum a posteriori for these different priors is related to the nuclear norm minimization problems. We also compare all these priors on simulated datasets, and on the classical MovieLens and Netflix datasets.

artificial intelligence, machine learning, matrix completion, (19 more...)

1406.144

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.55)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Low-Rank Modeling and Its Applications in Image Analysis

Zhou, Xiaowei, Yang, Can, Zhao, Hongyu, Yu, Weichuan

Low-rank modeling generally refers to a class of methods that solve problems by representing variables of interest as low-rank matrices. It has achieved great success in various fields including computer vision, data mining, signal processing and bioinformatics. Recently, much progress has been made in theories, algorithms and applications of low-rank modeling, such as exact low-rank matrix recovery via convex programming and matrix completion applied to collaborative filtering. These advances have brought more and more attentions to this topic. In this paper, we review the recent advance of low-rank modeling, the state-of-the-art algorithms, and related applications in image analysis. We first give an overview to the concept of low-rank modeling and challenging problems in this area. Then, we summarize the models and algorithms for low-rank matrix recovery and illustrate their advantages and limitations with numerical experiments. Next, we introduce a few applications of low-rank modeling in the context of image analysis. Finally, we conclude this paper with some discussions.

artificial intelligence, bayesian inference, machine learning, (16 more...)

1401.3409

Country: Asia > China (0.28)

Genre: Overview (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(4 more...)

Miller, Benjamin A., Beard, Michelle S., Wolfe, Patrick J., Bliss, Nadya T.

A Spectral Framework for Anomalous Subgraph Detection

A wide variety of application domains are concerned with data consisting of entities and their relationships or connections, formally represented as graphs. Within these diverse application areas, a common problem of interest is the detection of a subset of entities whose connectivity is anomalous with respect to the rest of the data. While the detection of such anomalous subgraphs has received a substantial amount of attention, no application-agnostic framework exists for analysis of signal detectability in graph-based data. In this paper, we describe a framework that enables such analysis using the principal eigenspace of a graph's residuals matrix, commonly called the modularity matrix in community detection. Leveraging this analytical tool, we show that the framework has a natural power metric in the spectral norm of the anomalous subgraph's adjacency matrix (signal power) and of the background graph's residuals matrix (noise power). We propose several algorithms based on spectral properties of the residuals matrix, with more computationally expensive techniques providing greater detection power. Detection and identification performance are presented for a number of signal and noise models, including clusters and bipartite foregrounds embedded into simple random backgrounds as well as graphs with community structure and realistic degree distributions. The trends observed verify intuition gleaned from other signal processing areas, such as greater detection power when the signal is embedded within a less active portion of the background. We demonstrate the utility of the proposed techniques in detecting small, highly anomalous subgraphs in real graphs derived from Internet traffic and product co-purchases.

data mining, machine learning, subgraph, (19 more...)

doi: 10.1109/TSP.2015.2437841

1401.7702

Country: North America > United States > Massachusetts > Middlesex County (0.28)

Genre: Research Report (0.81)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Communications > Networks (0.93)

Daily Stress Recognition from Mobile Phone Data, Weather Conditions and Individual Traits

Bogomolov, Andrey, Lepri, Bruno, Ferron, Michela, Pianesi, Fabio, Alex, null, Pentland, null

Research has proven that stress reduces quality of life and causes many diseases. For this reason, several researchers devised stress detection systems based on physiological parameters. However, these systems require that obtrusive sensors are continuously carried by the user. In our paper, we propose an alternative approach providing evidence that daily stress can be reliably recognized based on behavioral metrics, derived from the user's mobile phone activity and from additional indicators, such as the weather conditions (data pertaining to transitory properties of the environment) and the personality traits (data concerning permanent dispositions of individuals). Our multifactorial statistical model, which is person-independent, obtains the accuracy score of 72.28% for a 2-class daily stress recognition problem. The model is efficient to implement for most of multimedia applications due to highly reduced low-dimensional feature space (32d). Moreover, we identify and discuss the indicators which have strong predictive power.

artificial intelligence, machine learning, personality trait, (15 more...)

doi: 10.1145/2647868.2654933

1410.5816

Country:

North America > United States (1.00)
Europe (1.00)
Asia (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Bogomolov, Andrey, Lepri, Bruno, Pianesi, Fabio

Generalized Compression Dictionary Distance as Universal Similarity Measure

ABSTRACT We present a new similarity measure based on information theoretic measures which is superior than Normalized Compression Distance for clustering problems and inherits the useful properties of conditional Kolmogorov complexity. We show that Normalized Compression Dictionary Size and Normalized Compression Dictionary Entropy are computationally more efficient, as the need to perform the compression itself is eliminated. Also they scale linearly with exponential vector size growth and are content independent. We show that normalized compression dictionary distance is compressor independent, if limited to lossless compressors, which gives space for optimizations and implementation speed improvement for real-time and big data applications. The introduced measure is applicable for machine learning tasks of parameter-free unsupervised clustering, supervised learning such as classification and regression, feature selection, and is applicable for big data problems with order of magnitude speed increase.

artificial intelligence, data mining, machine learning, (15 more...)

1410.5792

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.54)

Jain, Prateek, Tewari, Ambuj, Kar, Purushottam

On Iterative Hard Thresholding Methods for High-dimensional M-Estimation

The use of M-estimators in generalized linear regression models in high dimensional settings requires risk minimization with hard $L_0$ constraints. Of the known methods, the class of projected gradient descent (also known as iterative hard thresholding (IHT)) methods is known to offer the fastest and most scalable solutions. However, the current state-of-the-art is only able to analyze these methods in extremely restrictive settings which do not hold in high dimensional statistical models. In this work we bridge this gap by providing the first analysis for IHT-style methods in the high dimensional statistical setting. Our bounds are tight and match known minimax lower bounds. Our results rely on a general analysis framework that enables us to analyze several popular hard thresholding style algorithms (such as HTP, CoSaMP, SP) in the high dimensional regression setting. We also extend our analysis to a large family of "fully corrective methods" that includes two-stage and partial hard-thresholding algorithms. We show that our results hold for the problem of sparse regression, as well as low-rank matrix recovery.

algorithm, artificial intelligence, machine learning, (17 more...)

1410.5137

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)

Claesen, Marc, De Smet, Frank, Suykens, Johan A. K., De Moor, Bart

A Robust Ensemble Approach to Learn From Positive and Unlabeled Data Using SVM Base Models

We present a novel approach to learn binary classifiers when only positive and unlabeled instances are available (PU learning). This problem is routinely cast as a supervised task with label noise in the negative set. We use an ensemble of SVM models trained on bootstrap resamples of the training data for increased robustness against label noise. The approach can be considered in a bagging framework which provides an intuitive explanation for its mechanics in a semi-supervised setting. We compared our method to state-of-the-art approaches in simulations using multiple public benchmark data sets. The included benchmark comprises three settings with increasing label noise: (i) fully supervised, (ii) PU learning and (iii) PU learning with false positives. Our approach shows a marginal improvement over existing methods in the second setting and a significant improvement in the third. Frank De Smet is a member of the medical management department of the National Alliance of Christian Mutualities. Accepted at Neurocomputing: SI on Advances in Learning with Label Noise 20/10/2014 1. Introduction Training binary classifiers on positive and unlabeled data is referred to as PU learning [31]. The absence of known negative training instances warrants appropriate learning methods. Inaccurate label information can be more problematic than attribute noise [45]. Specialised PU learning approaches are recommended when (i) negative labels cannot be acquired, (ii) the training data contains a large amount of false negatives or (iii) the positive set has many outliers. Practical applications of PU learning typically feature large, imbalanced training sets with a small amount of labeled (positive) and a large amount of unlabeled training instances. The PU learning problem arises in various settings, including web page classification [44], intrusion detection [26] and bioinformatics tasks such as variant prioritization [42], gene prioritization [1, 35] and virtual screening of drug compounds [41]. Though these applications share a common underlying learning problem, the final evaluation criteria may be fundamentally different.

artificial intelligence, inductive learning, machine learning, (18 more...)

doi: 10.1016/j.neucom.2014.10.081

1402.3144

Country:

Europe (0.95)
North America > United States > California (0.46)

Genre:

Personal (1.00)
Research Report > Experimental Study (0.94)
Research Report > New Finding (0.93)

Industry:

Government (0.93)
Health & Medicine > Therapeutic Area > Oncology (0.46)
Education > Focused Education > Special Education (0.44)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)