AITopics

We consider the problem of learning a vector-valued function f in an online learning setting. The function f is assumed to lie in a reproducing Hilbert space of operator-valued kernels. We describe two online algorithms for learning f while taking into account the output structure. A first contribution is an algorithm, ONORMA, that extends the standard kernel-based online learning algorithm NORMA from scalar-valued to operator-valued setting. We report a cumulative error bound that holds both for classification and regression. We then define a second algorithm, MONORMA, which addresses the limitation of pre-defining the output structure in ONORMA by learning sequentially a linear combination of operator-valued kernels. Our experiments show that the proposed algorithms achieve good performance results with low computational cost.

algorithm, artificial intelligence, machine learning, (14 more...)

1311.0222

Genre: Research Report (0.82)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.85)

Monnig, Nathan D., Fornberg, Bengt, Meyer, Francois G.

Inverting Nonlinear Dimensionality Reduction with Scale-Free Radial Basis Function Interpolation

Nonlinear dimensionality reduction embeddings computed from datasets do not provide a mechanism to compute the inverse map. In this paper, we address the problem of computing a stable inverse map to such a general bi-Lipschitz map. Our approach relies on radial basis functions (RBFs) to interpolate the inverse map everywhere on the low-dimensional image of the forward map. We demonstrate that the scale-free cubic RBF kernel performs better than the Gaussian kernel: it does not suffer from ill-conditioning, and does not require the choice of a scale. The proposed construction is shown to be similar to the Nystr\"om extension of the eigenvectors of the symmetric normalized graph Laplacian matrix. Based on this observation, we provide a new interpretation of the Nystr\"om extension with suggestions for improvement.

artificial intelligence, extension, machine learning, (16 more...)

1305.0258

Country: North America > United States > Colorado (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.61)

Georgiev, Stoyan, Mukherjee, Sayan

Randomized Dimension Reduction on Massive Data

Scalability of statistical estimators is of increasing importance in modern applications and dimension reduction is often used to extract relevant information from data. A variety of popular dimension reduction approaches can be framed as symmetric generalized eigendecomposition problems. In this paper we outline how taking into account the low rank structure assumption implicit in these dimension reduction approaches provides both computational and statistical advantages. We adapt recent randomized low-rank approximation algorithms to provide efficient solutions to three dimension reduction methods: Principal Component Analysis (PCA), Sliced Inverse Regression (SIR), and Localized Sliced Inverse Regression (LSIR). A key observation in this paper is that randomization serves a dual role, improving both computational and statistical performance. This point is highlighted in our experiments on real and simulated data.

algorithm, artificial intelligence, machine learning, (15 more...)

1211.1642

Country: North America > United States > California > Santa Clara County (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (1.00)

Amini, Arash A., Chen, Aiyou, Bickel, Peter J., Levina, Elizaveta

Pseudo-likelihood methods for community detection in large sparse networks

Many algorithms have been proposed for fitting network models with communities, but most of them do not scale well to large networks, and often fail on sparse networks. Here we propose a new fast pseudo-likelihood method for fitting the stochastic block model for networks, as well as a variant that allows for an arbitrary degree distribution by conditioning on degrees. We show that the algorithms perform well under a range of settings, including on very sparse networks, and illustrate on the example of a network of political blogs. We also propose spectral clustering with perturbations, a method of independent interest, which works well on sparse networks where regular spectral clustering fails, and use it to provide an initial value for pseudo-likelihood. We prove that pseudo-likelihood provides consistent estimates of the communities under a mild condition on the starting value, for the case of a block model with two communities.

block model, data mining, machine learning, (20 more...)

doi: 10.1214/13-AOS1138

1207.234

Country:

North America > United States > California (0.46)
North America > United States > Michigan (0.28)

Genre: Research Report (0.82)

Industry: Government (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)
(2 more...)

Jain, Ashesh, Wojcik, Brian, Joachims, Thorsten, Saxena, Ashutosh

Learning Trajectory Preferences for Manipulators via Iterative Improvement

arXiv.org Artificial IntelligenceNov-5-2013

We consider the problem of learning good trajectories for manipulation tasks. This is challenging because the criterion defining a good trajectory varies with users, tasks and environments. In this paper, we propose a co-active online learning framework for teaching robots the preferences of its users for object manipulation tasks. The key novelty of our approach lies in the type of feedback expected from the user: the human user does not need to demonstrate optimal trajectories as training data, but merely needs to iteratively provide trajectories that slightly improve over the trajectory currently proposed by the system. We argue that this co-active preference feedback can be more easily elicited from the user than demonstrations of optimal trajectories, which are often challenging and non-intuitive to provide on high degrees of freedom manipulators. Nevertheless, theoretical regret bounds of our algorithm match the asymptotic rates of optimal trajectory algorithms. We demonstrate the generalizability of our algorithm on a variety of grocery checkout tasks, for whom, the preferences were not only influenced by the object being manipulated but also by the surrounding environment.\footnote{For more details and a demonstration video, visit: \url{http://pr.cs.cornell.edu/coactive}}

artificial intelligence, machine learning, trajectory, (17 more...)

arXiv.org Artificial Intelligence

1306.6294

Genre: Research Report > New Finding (0.68)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

arXiv.org Machine LearningNov-4-2013

Stochastic Dual Coordinate Ascent with Alternating Direction Multiplier Method

Suzuki, Taiji

We propose a new stochastic dual coordinate ascent technique that can be applied to a wide range of regularized learning problems. Our method is based on Alternating Direction Multiplier Method (ADMM) to deal with complex regularization functions such as structured regularizations. Although the original ADMM is a batch method, the proposed method offers a stochastic update rule where each iteration requires only one or few sample observations. Moreover, our method can naturally afford mini-batch update and it gives speed up of convergence. We show that, under mild assumptions, our method converges exponentially. The numerical experiments show that our method actually performs efficiently.

artificial intelligence, machine learning, optimization problem, (12 more...)

1311.0622

Country: Asia > Japan (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Chen, Jie, Richard, Cédric, Sayed, Ali. H.

Multitask Diffusion Adaptation over Networks

arXiv.org Artificial IntelligenceNov-2-2013

Adaptive networks are suitable for decentralized inference tasks, e.g., to monitor complex natural phenomena. Recent research works have intensively studied distributed optimization problems in the case where the nodes have to estimate a single optimum parameter vector collaboratively. However, there are many important applications that are multitask-oriented in the sense that there are multiple optimum parameter vectors to be inferred simultaneously, in a collaborative manner, over the area covered by the network. In this paper, we employ diffusion strategies to develop distributed algorithms that address multitask problems by minimizing an appropriate mean-square error criterion with $\ell_2$-regularization. The stability and convergence of the algorithm in the mean and in the mean-square sense is analyzed. Simulations are conducted to verify the theoretical findings, and to illustrate how the distributed strategy can be used in several useful applications related to spectral sensing, target localization, and hyperspectral data unmixing.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TSP.2014.2333560

1311.4894

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > California > Monterey County > Pacific Grove (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(15 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)

arXiv.org Machine LearningNov-2-2013

Multivariate Generalized Gaussian Process Models

Chan, Antoni B.

We propose a family of multivariate Gaussian process models for correlated outputs, based on assuming that the likelihood function takes the generic form of the multivariate exponential family distribution (EFD). We denote this model as a multivariate generalized Gaussian process model, and derive Taylor and Laplace algorithms for approximate inference on the generic model. By instantiating the EFD with specific parameter functions, we obtain two novel GP models (and corresponding inference algorithms) for correlated outputs: 1) a Von-Mises GP for angle regression; and 2) a Dirichlet GP for regressing on the multinomial simplex.

approximation, artificial intelligence, machine learning, (15 more...)

1311.036

Country: Asia (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

arXiv.org Machine LearningNov-2-2013

Inferring clonal evolution of tumors from single nucleotide somatic mutations

Jiao, Wei, Vembu, Shankar, Deshwar, Amit G., Stein, Lincoln, Morris, Quaid

High-throughput sequencing allows the detection and quantification of frequencies of somatic single nucleotide variants (SNV) in heterogeneous tumor cell populations. In some cases, the evolutionary history and population frequency of the subclonal lineages of tumor cells present in the sample can be reconstructed from these SNV frequency measurements. However, automated methods to do this reconstruction are not available and the conditions under which reconstruction is possible have not been described. We describe the conditions under which the evolutionary history can be uniquely reconstructed from SNV frequencies from single or multiple samples from the tumor population and we introduce a new statistical model, PhyloSub, that infers the phylogeny and genotype of the major subclonal lineages represented in the population of cancer cells. It uses a Bayesian nonparametric prior over trees that groups SNVs into major subclonal lineages and automatically estimates the number of lineages and their ancestry. We sample from the joint posterior distribution over trees to identify evolutionary histories and cell population frequencies that have the highest probability of generating the observed SNV frequency data. When multiple phylogenies are consistent with a given set of SNV frequencies, PhyloSub represents the uncertainty in the tumor phylogeny using a partial order plot. Experiments on a simulated dataset and two real datasets comprising tumor samples from acute myeloid leukemia and chronic lymphocytic leukemia patients demonstrate that PhyloSub can infer both linear (or chain) and branching lineages and its inferences are in good agreement with ground truth, where it is available.

artificial intelligence, frequency, machine learning, (15 more...)

1210.3384

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Leukemia (1.00)
Health & Medicine > Therapeutic Area > Hematology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Franczak, Brian C., McNicholas, Paul D., Browne, Ryan P., Murray, Paula M.

Parsimonious Shifted Asymmetric Laplace Mixtures

arXiv.org Machine LearningNov-1-2013

A family of parsimonious shifted asymmetric Laplace mixture models is introduced. We extend the mixture of factor analyzers model to the shifted asymmetric Laplace distribution. Imposing constraints on the constitute parts of the resulting decomposed component scale matrices leads to a family of parsimonious models. An explicit two-stage parameter estimation procedure is described, and the Bayesian information criterion and the integrated completed likelihood are compared for model selection. This novel family of models is applied to real data, where it is compared to its Gaussian analogue within clustering and classification paradigms.

artificial intelligence, bayesian inference, machine learning, (16 more...)

1311.0317

Country:

Europe (0.67)
North America > Canada > Ontario (0.46)
North America > United States > California (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)