AITopics

doi: 10.1109/TAC.2016.2582642

1411.562

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)

Taniguchi, Tadahiro, Takano, Toshiaki, Yoshino, Ryo

Multimodal Hierarchical Dirichlet Process-based Active Perception

arXiv.org Machine Learning, Jan-14-2016

In this paper, we propose an active perception method for recognizing object categories based on the multimodal hierarchical Dirichlet process (MHDP). The MHDP enables a robot to form object categories using multimodal information, e.g., visual, auditory, and haptic information, which can be observed by performing actions on an object. However, performing many actions on a target object takes a long time. In a real-time scenario, i.e., when time is limited, the robot has to determine the set of actions that is most effective for recognizing a target object. We propose an MHDP-based active perception method that uses the information gain (IG) maximization criterion and a lazy greedy algorithm. We show that the IG maximization criterion is optimal in the sense that it is equivalent to minimizing the expected Kullback-Leibler divergence between the final recognition state and the recognition state after the next set of actions. However, a straightforward calculation of IG is practically impossible. Therefore, we derive an efficient Monte Carlo approximation method for IG by making use of a property of the MHDP. We also show that the IG is submodular and non-decreasing as a set function because of the structure of the graphical model of the MHDP. Therefore, the IG maximization problem reduces to a submodular maximization problem. This means that greedy and lazy greedy algorithms are effective and have a theoretical justification for their performance. We conducted one experiment using an upper-torso humanoid robot and a second one using synthetic data. The experimental results show that the method enables the robot to select a set of actions that allows it to recognize target objects quickly and accurately. The results support our theoretical outcomes.
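The lazy greedy selection mentioned in the abstract can be sketched as follows. This is a minimal illustration for any non-decreasing submodular set function, not the authors' implementation: the function `lazy_greedy`, the `COVERS` table, the action names, and the coverage-based `coverage_gain` are all hypothetical stand-ins (the paper's gain is a Monte Carlo estimate of IG, which is not reproduced here).

```python
import heapq


def lazy_greedy(candidates, gain_fn, budget):
    """Pick up to `budget` items for a non-decreasing submodular objective.

    gain_fn(selected, item) must return the marginal gain of adding `item`
    to the list `selected`. Submodularity guarantees that a gain computed
    for an earlier, smaller selection is an upper bound on the current gain.
    """
    selected = []
    # Max-heap via negated gains. The second field records the size of the
    # selection for which the stored gain was computed.
    heap = [(-gain_fn(selected, c), 0, c) for c in candidates]
    heapq.heapify(heap)
    while heap and len(selected) < budget:
        neg_gain, stamp, item = heapq.heappop(heap)
        if stamp == len(selected):
            # The stored gain is up to date and beats every (possibly stale)
            # upper bound left in the heap, so this item is the greedy choice.
            selected.append(item)
        else:
            # Stale entry: recompute the gain and push it back.
            fresh = gain_fn(selected, item)
            heapq.heappush(heap, (-fresh, len(selected), item))
    return selected


# Hypothetical toy objective: each action "covers" some evidence ids.
# Set coverage is a standard non-decreasing submodular function.
COVERS = {
    "look": {1, 2, 3},
    "grasp": {3, 4},
    "shake": {4, 5, 6, 7},
    "hit": {1, 2},
}


def coverage_gain(selected, item):
    covered = set().union(*(COVERS[s] for s in selected))
    return len(COVERS[item] - covered)


chosen = lazy_greedy(list(COVERS), coverage_gain, budget=2)
print(chosen)
```

The saving over plain greedy comes from the stale-entry branch: an item is re-evaluated only when it reaches the top of the heap, so most candidates are never rescored in a given round.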

information, machine learning, object-oriented architecture

1510.00331

Genre: Research Report > New Finding (0.66)

Industry: Education > Educational Setting (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)

Giecold, Gregory, Marco, Eugenio, Trippa, Lorenzo, Yuan, Guo-Cheng

Robust Lineage Reconstruction from High-Dimensional Single-Cell Data

Single-cell gene expression data provide invaluable resources for systematic characterization of cellular hierarchy in multi-cellular organisms. However, cell lineage reconstruction is still often associated with significant uncertainty due to technological constraints. Such uncertainties have not been taken into account in current methods. We present ECLAIR, a novel computational method for the statistical inference of cell lineage relationships from single-cell gene expression data. ECLAIR uses an ensemble approach to improve the robustness of lineage predictions, and provides a quantitative estimate of the uncertainty of lineage branchings. We show that the application of ECLAIR to published datasets successfully reconstructs known lineage relationships and significantly improves the robustness of predictions. In conclusion, ECLAIR is a powerful bioinformatics tool for single-cell data analysis. It can be used for robust lineage reconstruction with a quantitative estimate of prediction accuracy.

artificial intelligence, eclair, machine learning

1601.02748

Country: North America > United States (0.47)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)

Chen, Guangyong, Zhu, Fengyuan, Heng, Pheng Ann

Online Prediction of Dyadic Data with Heterogeneous Matrix Factorization

Dyadic Data Prediction (DDP) is an important problem in many research areas. This paper develops a novel fully Bayesian nonparametric framework that integrates two popular and complementary approaches, discrete mixed membership modeling and continuous latent factor modeling, into a unified Heterogeneous Matrix Factorization (HeMF) model, which can predict unobserved dyads accurately. The HeMF can determine the number of communities automatically and exploit the latent linear structure of each bicluster efficiently. We propose a Variational Bayesian method to estimate the parameters and missing data. We further develop a novel online learning approach for variational inference and use it for the online learning of HeMF, which can efficiently cope with the important large-scale DDP problem. We evaluate the performance of our method on the EachMovie, MovieLens and Netflix Prize collaborative filtering datasets. The experiments show that our model outperforms state-of-the-art methods on all benchmarks. Compared with the Stochastic Gradient Method (SGD), our online learning approach achieves a significant improvement in estimation accuracy and robustness.

artificial intelligence, bayesian inference, machine learning

1601.03124

Genre: Research Report > Promising Solution (0.48)

Industry: Media > Film (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Zhu, Fengyuan, Chen, Guangyong, Hao, Jianye, Heng, Pheng-Ann

Blind Image Denoising via Dependent Dirichlet Process Tree

Most existing image denoising approaches assume the noise to be homogeneous white Gaussian distributed with known intensity. However, in real noisy images, the noise models are usually unknown beforehand and can be much more complex. This paper addresses this problem and proposes a novel blind image denoising algorithm to recover the clean image from a noisy one with an unknown noise model. To model the empirical noise of an image, our method introduces a mixture of Gaussians, which is flexible enough to approximate different continuous distributions. The problem of blind image denoising is reformulated as a learning problem. The procedure is to first build a two-layer structural model for noisy patches and consider the clean ones as latent variables. To control the complexity of the noisy patch model, this work proposes a novel Bayesian nonparametric prior called the "Dependent Dirichlet Process Tree" to build the model. Then, this study derives a variational inference algorithm to estimate model parameters and recover clean patches. We apply our method to synthetic and real noisy images with different noise models. Compared with previous approaches, ours achieves better performance. The experimental results indicate that the proposed algorithm copes efficiently with practical image denoising tasks.

artificial intelligence, machine learning, noise

1601.03117

Country:

North America (0.46)
Asia > China (0.28)

Genre: Research Report (0.50)

Industry:

Consumer Products & Services (0.47)
Education (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Sedghi, Hanie, Janzamin, Majid, Anandkumar, Anima

Provable Tensor Methods for Learning Mixtures of Generalized Linear Models

A generalized linear model (GLM) is a flexible extension of linear regression which allows the response or the output to be a nonlinear function of the input via an activation function. In other words, in a GLM, the linear regression of the input is passed through an activation function to generate the response. GLMs unify popular frameworks such as logistic regression and Poisson regression with linear regression. At the same time, they can be learnt with guarantees using simple iterative methods (Kakade et al., 2011). In many scenarios, however, GLMs may be too simplistic, and mixtures of GLMs can be much more effective since they combine the expressive power of latent variables with the predictive capabilities of the GLM. Mixtures of GLMs have widespread applicability including object recognition (Quattoni et al., 2004), human action recognition (Wang and Mori, 2009), syntactic parsing (Petrov and Klein, 2007), and machine translation (Liang et al., 2006). Traditionally, mixture models are learnt through heuristics such as expectation maximization (EM) (Jordan and Jacobs, 1994; Xu et al., 1995) or variational Bayes (Bishop and Svensen, 2003). However, these methods can converge to spurious local optima and have slow convergence rates for high dimensional models. In contrast, we employ a method-of-moments approach for guaranteed learning of mixtures of GLMs.
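The model class described above, a linear function of the input passed through an activation, and a latent mixture over several such models, can be written down in a few lines. This is a hedged sketch of the conditional mean only; the function names and parameter values are illustrative assumptions, and it does not implement the paper's method-of-moments or tensor learning procedure.

```python
import math


def sigmoid(z):
    """Logistic activation, as used in logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


def glm_mean(x, w, b, activation=sigmoid):
    """E[y | x] for a single GLM: activation(<w, x> + b)."""
    return activation(sum(wi * xi for wi, xi in zip(w, x)) + b)


def mixture_glm_mean(x, mix_weights, components, activation=sigmoid):
    """E[y | x] for a mixture of GLMs.

    mix_weights: mixing probabilities of the latent component (sum to 1).
    components:  list of (w, b) pairs, one per latent component.
    """
    return sum(
        p * glm_mean(x, w, b, activation)
        for p, (w, b) in zip(mix_weights, components)
    )


# Illustrative parameters (assumptions, not taken from the paper).
x = [0.3, 0.7]
components = [([2.0, -1.0], 0.0), ([-2.0, 1.0], 0.0)]
print(glm_mean(x, *components[0]))                    # single logistic GLM
print(mixture_glm_mean(x, [0.5, 0.5], components))    # two-component mixture
print(glm_mean([1.0], [0.0], 0.0, activation=math.exp))  # Poisson-style log link
```

Swapping `activation` between `sigmoid`, `math.exp`, and the identity recovers the logistic, Poisson, and linear regression cases that the abstract says GLMs unify.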

artificial intelligence, machine learning, score function

1412.3046

Country:

North America > United States > California (0.28)
Asia > Middle East > Jordan (0.25)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Machine Learning, Jan-11-2016

How to learn a graph from smooth signals

Kalofolias, Vassilis

We propose a framework that learns the graph structure underlying a set of smooth signals. Given $X\in\mathbb{R}^{m\times n}$ whose rows reside on the vertices of an unknown graph, we learn the edge weights $w\in\mathbb{R}_+^{m(m-1)/2}$ under the smoothness assumption that $\text{tr}(X^\top L X)$ is small. We show that the problem is a weighted $\ell_1$ minimization that leads to naturally sparse solutions. We point out how known graph learning or construction techniques fall within our framework and propose a new model that performs better than the state of the art in many settings. We present efficient, scalable primal-dual based algorithms for both our model and the previous state of the art, and evaluate their performance on artificial and real data.
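The link between the smoothness term and a weighted sum over edges rests on a standard identity: for a symmetric weight matrix $W$ with zero diagonal and Laplacian $L = D - W$, the trace of $X^\top L X$ equals $\frac{1}{2}\sum_{i,j} W_{ij}\|x_i - x_j\|^2$, where $x_i$ is the $i$-th row of $X$. The sketch below checks this numerically on a small example; the helper names and the toy graph are assumptions made for illustration, and no graph learning is performed.

```python
def laplacian(W):
    """Combinatorial Laplacian L = D - W for a symmetric, zero-diagonal W."""
    m = len(W)
    return [
        [(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(m)]
        for i in range(m)
    ]


def smoothness_trace(X, W):
    """tr(X^T L X), computed entry by entry from the Laplacian."""
    L = laplacian(W)
    m, n = len(X), len(X[0])
    return sum(
        X[i][k] * L[i][j] * X[j][k]
        for k in range(n)
        for i in range(m)
        for j in range(m)
    )


def smoothness_pairwise(X, W):
    """0.5 * sum_ij W_ij * ||x_i - x_j||^2 over rows x_i of X."""
    m = len(X)
    return 0.5 * sum(
        W[i][j] * sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
        for i in range(m)
        for j in range(m)
    )


# Toy path graph on 3 vertices with edge weights 1 and 2 (illustrative).
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 2.0],
     [0.0, 2.0, 0.0]]
# m = 3 vertices, n = 2 signals; row i holds the values at vertex i.
X = [[0.0, 1.0],
     [1.0, 1.0],
     [3.0, 0.0]]

print(smoothness_trace(X, W), smoothness_pairwise(X, W))
```

Because the pairwise distances are fixed by the data and the weights are non-negative, the right-hand side is linear in the weights, which is why the abstract can describe the problem as a weighted $\ell_1$ minimization.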

artificial intelligence, graph, machine learning

1601.02513

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)

Marti, Gautier, Nielsen, Frank, Donnat, Philippe

Optimal Copula Transport for Clustering Multivariate Time Series

arXiv.org Machine Learning, Jan-11-2016

Hellebore Capital Management † Ecole Polytechnique

ABSTRACT

This paper presents a new methodology for clustering multivariate time series leveraging optimal transport between copulas. Copulas are used to encode both (i) intra-dependence of a multivariate time series, and (ii) interdependence between two time series. Then, optimal copula transport allows us to define two distances between multivariate time series: (i) one for measuring intra-dependence dissimilarity, (ii) another one for measuring interdependence dissimilarity based on a new multivariate dependence coefficient which is robust to noise, deterministic, and which can target specified dependencies.

Index Terms: Clustering; Multivariate Time Series; Optimal Transport; Earth Mover's Distance; Empirical Copula; Dependence Coefficient

1. INTRODUCTION

Clustering is the task of grouping a set of objects in such a way that objects in the same group, also called cluster, are more similar to each other than those in different groups. This primitive in unsupervised machine learning is known to be hard to formalize and hard to solve.
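The empirical copula listed among the index terms is commonly built from normalized ranks: each variable is replaced by its rank divided by the sample size, which keeps the dependence structure and discards the marginals. The sketch below shows only that rank transform, under the assumption of no ties; the function names are hypothetical, and the optimal transport distance between copulas is not implemented here.

```python
import math


def normalized_ranks(values):
    """Map each observation to rank / n, with ranks 1..n (assumes no ties)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank / n
    return ranks


def empirical_copula_sample(series):
    """Rank-transform each variable and return points in (0, 1]^d.

    series: list of d equal-length lists, one per variable.
    """
    columns = [normalized_ranks(s) for s in series]
    return list(zip(*columns))


# Illustrative data: the second variable is a monotone function of the first.
a = [10.0, -2.0, 5.0, 7.0]
b = [math.exp(v) for v in a]
points = empirical_copula_sample([a, b])
print(points)
```

Since `b` is an increasing transform of `a`, both variables receive identical ranks and every point lies on the diagonal, illustrating that the copula representation is invariant to monotone changes of the marginals.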

artificial intelligence, data mining, machine learning

1509.08144

Country: North America (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Janzamin, Majid, Sedghi, Hanie, Anandkumar, Anima

Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods

arXiv.org Machine Learning, Jan-11-2016

Training neural networks is a challenging non-convex optimization problem, and backpropagation or gradient descent can get stuck in spurious local optima. We propose a novel algorithm based on tensor decomposition for guaranteed training of two-layer neural networks. We provide risk bounds for our proposed method, with a polynomial sample complexity in the relevant parameters, such as input dimension and number of neurons. While learning arbitrary target functions is NP-hard, we provide transparent conditions on the function and the input for learnability. Our training method is based on tensor decomposition, which provably converges to the global optimum, under a set of mild non-degeneracy conditions. It consists of simple embarrassingly parallel linear and multi-linear operations, and is competitive with standard stochastic gradient descent (SGD), in terms of computational complexity. Thus, we propose a computationally efficient method with guaranteed risk bounds for training neural networks with one hidden layer.

artificial intelligence, machine learning, neural network

1506.08473

Country: Europe (0.45)

Genre:

Research Report (0.82)
Workflow (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.74)

Alexandre, Claudio, Balsa, João

Client Profiling for an Anti-Money Laundering System

arXiv.org Artificial Intelligence, Jan-11-2016

Preventing and fighting money laundering (ML) crimes is a priority for almost every government in the world, on the same level as the most pressing global issues. Money laundering is a crime that typically consists of turning an illegal financial gain into an apparently legal one. According to the United Nations Office on Drugs and Crime (UNODC), the annual global estimate of laundered money is about 2% to 5% of the Gross World Product, or US$800 billion to US$2 trillion [1]. As if the financial volume were not enough, another reason for governments to focus on this crime is that it is clearly connected to other types of crime such as illegal drug trade, fraud, corruption, kidnapping, terrorism, and arms smuggling, among others. Most countries' financial authorities, usually Central Banks, are responsible for defining and enforcing anti-money laundering (AML) regulations, requiring financial institutions to implement procedures that apply the defined norms.

artificial intelligence, data mining, machine learning

1510.00878

Country:

Europe > Portugal > Lisbon > Lisbon (0.15)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York (0.14)

Genre: Research Report (0.64)

Industry:

Law (1.00)
Banking & Finance (1.00)
Law Enforcement & Public Safety > Fraud (0.84)
Government > Intergovernmental Programs (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.96)