AITopics | Mathematical & Statistical Methods

Collaborating Authors

Mathematical & Statistical Methods

News Overviews Instructional Materials AI-Alerts Classics

Fast Newton-CG Method for Batch Learning of Conditional Random Fields

Tsuboi, Yuta (IBM Research - Tokyo) | Unno, Yuya (Preferred Infrastructure, Inc.) | Kashima, Hisashi (The University of Tokyo) | Okazaki, Naoaki (Tohoku University)

AAAI ConferencesAug-4-2011

We propose a fast batch learning method for linear-chain Conditional Random Fields (CRFs) based on Newton-CG methods. Newton-CG methods are a variant of Newton method for high-dimensional problems. They only require the Hessian-vector products instead of the full Hessian matrices. To speed up Newton-CG methods for the CRF learning, we derive a novel dynamic programming procedure for the Hessian-vector products of the CRF objective function. The proposed procedure can reuse the byproducts of the time-consuming gradient computation for the Hessian-vector products to drastically reduce the total computation time of the Newton-CG methods. In experiments with tasks in natural language processing, the proposed method outperforms a conventional quasi-Newton method. Remarkably, the proposed method is competitive with online learning algorithms that are fast but unstable.

algorithm, artificial intelligence, optimization problem, (19 more...)

AAAI Conferences

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Asia > Japan > Honshū > Kantō (0.15)
Asia > Japan > Honshū > Tōhoku (0.14)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
(2 more...)

Add feedback

A random walk on image patches

Taylor, Kye M., Meyer, Francois G.

arXiv.org Machine LearningJul-2-2011

In this paper we address the problem of understanding the success of algorithms that organize patches according to graph-based metrics. Algorithms that analyze patches extracted from images or time series have led to state-of-the art techniques for classification, denoising, and the study of nonlinear dynamics. The main contribution of this work is to provide a theoretical explanation for the above experimental observations. Our approach relies on a detailed analysis of the commute time metric on prototypical graph models that epitomize the geometry observed in general patch graphs. We prove that a parametrization of the graph based on commute times shrinks the mutual distances between patches that correspond to rapid local changes in the signal, while the distances between patches that correspond to slow local changes expand. In effect, our results explain why the parametrization of the set of patches based on the eigenfunctions of the Laplacian can concentrate patches that correspond to rapid local changes, which would otherwise be shattered in the space of patches. While our results are based on a large sample analysis, numerical experimentations on synthetic and real data indicate that the results hold for datasets that are very small in practice.

artificial intelligence, graph, upstream oil & gas, (20 more...)

arXiv.org Machine Learning

1107.0414

Country: North America > United States > Colorado > Boulder County > Boulder (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Oil & Gas > Upstream (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.66)

Add feedback

Neyman-Pearson classification, convexity and stochastic constraints

Rigollet, Philippe, Tong, Xin

arXiv.org Machine LearningFeb-28-2011

Motivated by problems of anomaly detection, this paper implements the Neyman-Pearson paradigm to deal with asymmetric errors in binary classification with a convex loss. Given a finite collection of classifiers, we combine them and obtain a new classifier that satisfies simultaneously the two following properties with high probability: (i) its probability of type I error is below a pre-specified level and (ii), it has probability of type II error close to the minimum possible. The proposed classifier is obtained by solving an optimization problem with an empirical objective and an empirical constraint. New techniques to handle such problems are developed and have consequences on chance constrained programming.

classifier, constraint-based reasoning, optimization problem, (18 more...)

arXiv.org Machine Learning

1102.575

Country:

North America > United States (0.14)
Europe (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.64)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Add feedback

Randomized algorithms for statistical image analysis and site percolation on square lattices

Langovoy, Mikhail A., Wittich, Olaf

arXiv.org Machine LearningFeb-24-2011

We propose a novel probabilistic method for detection of objects in noisy images. The method uses results from percolation and random graph theories. We present an algorithm that allows to detect objects of unknown shapes in the presence of random noise. The algorithm has linear complexity and exponential accuracy and is appropriate for real-time systems. We prove results on consistency and algorithmic complexity of our procedure.

artificial intelligence, health & medicine, noisy image, (16 more...)

arXiv.org Machine Learning

doi: 10.1111/stan.12010

1102.5014

Country:

Europe > Netherlands (0.28)
North America > United States > Massachusetts (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)

Add feedback

Detection of objects in noisy images based on percolation theory

Davies, Patrick Laurie, Langovoy, Mikhail A., Wittich, Olaf

arXiv.org Machine LearningFeb-24-2011

We propose a novel statistical method for detection of objects in noisy images. The method uses results from percolation and random graph theories. We present an algorithm that allows to detect objects of unknown shapes in the presence of nonparametric noise of unknown level. The noise density is assumed to be unknown and can be very irregular. Our procedure substantially differs from wavelets-based algorithms. The algorithm has linear complexity and exponential accuracy and is appropriate for real-time systems. We prove results on consistency and algorithmic complexity of our procedure.

artificial intelligence, health & medicine, noisy image, (15 more...)

arXiv.org Machine Learning

1102.5019

Country:

North America > United States > Massachusetts (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)

Add feedback

Computationally efficient algorithms for statistical image processing. Implementation in R

Langovoy, Mikhail A., Wittich, Olaf

arXiv.org Machine LearningFeb-23-2011

In the series of our earlier papers on the subject, we proposed a novel statistical hypothesis testing method for detection of objects in noisy images. The method uses results from percolation theory and random graph theory. We developed algorithms that allowed to detect objects of unknown shapes in the presence of nonparametric noise of unknown level and of unknown distribution. No boundary shape constraints were imposed on the objects, only a weak bulk condition for the object's interior was required. Our algorithms have linear complexity and exponential accuracy. In the present paper, we describe an implementation of our nonparametric hypothesis testing method. We provide a program that can be used for statistical experiments in image processing. This program is written in the statistical programming language R.

algorithm, artificial intelligence, wittich image processing, (14 more...)

arXiv.org Machine Learning

1102.4816

Country: Europe > Netherlands (0.15)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)

Add feedback

MAP estimation in Binary MRFs via Bipartite Multi-cuts

Reddi, Sashank J., Sarawagi, Sunita, Vishwanathan, Sundar

Neural Information Processing SystemsDec-31-2010

We propose a new LP relaxation for obtaining the MAP assignment of a binary MRF with pairwise potentials. Our relaxation is derived from reducing the MAP assignment problem to an instance of a recently proposed Bipartite Multi-cut problem where the LP relaxation is guaranteed to provide an O(log k) approximation where k is the number of vertices adjacent to non-submodular edges in the MRF. We then propose a combinatorial algorithm to efficiently solve the LP and also provide a lower bound by concurrently solving its dual to within an ɛ approximation. The algorithm is up to an order of magnitude faster and provides better MAP scores and bounds than the state of the art message passing algorithm of [1] that tightens the local marginal polytope with third-order marginal constraints.

algorithm, artificial intelligence, optimization problem, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.35)

Add feedback

Optimal Web-Scale Tiering as a Flow Problem

Leung, Gilbert, Quadrianto, Novi, Tsioutsiouliklis, Kostas, Smola, Alex J.

Neural Information Processing SystemsDec-31-2010

We present a fast online solver for large scale maximum-flow problems as they occur in portfolio optimization, inventory management, computer vision, and logistics. Our algorithm solves an integer linear program in an online fashion. It exploits total unimodularity of the constraint matrix and a Lagrangian relaxation to solve the problem as a convex online game. The algorithm generates approximate solutions of max-flow problems by performing stochastic gradient descent on a set of flows. We apply the algorithm to optimize tier arrangement of over 80 Million web pages on a layered set of caches to serve an incoming query stream optimally. We provide an empirical demonstration of the effectiveness of our method on real query-pages data.

artificial intelligence, optimization problem, query, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County (0.46)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.70)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

Add feedback

Guaranteed Rank Minimization via Singular Value Projection

Jain, Prateek, Meka, Raghu, Dhillon, Inderjit S.

Neural Information Processing SystemsDec-31-2010

Minimizing the rank of a matrix subject to affine constraints is a fundamental problem with many important applications in machine learning and statistics. In this paper we propose a simple and fast algorithm SVP (Singular Value Projection) for rank minimization under affine constraints ARMP and show that SVP recovers the minimum rank solution for affine constraints that satisfy a Restricted Isometry Property} (RIP). Our method guarantees geometric convergence rate even in the presence of noise and requires strictly weaker assumptions on the RIP constants than the existing methods. We also introduce a Newton-step for our SVP framework to speed-up the convergence with substantial empirical gains. Next, we address a practically important application of ARMP - the problem of low-rank matrix completion, for which the defining affine constraints do not directly obey RIP, hence the guarantees of SVP do not hold. However, we provide partial progress towards a proof of exact recovery for our algorithm by showing a more restricted isometry property and observe empirically that our algorithm recovers low-rank Incoherent matrices from an almost optimal number of uniformly sampled entries. We also demonstrate empirically that our algorithms outperform existing methods, such as those of \cite{CaiCS2008,LeeB2009b, KeshavanOM2009}, for ARMP and the matrix completion problem by an order of magnitude and are also more robust to noise and sampling schemes. In particular, results show that our SVP-Newton method is significantly robust to noise and performs impressively on a more realistic power-law sampling scheme for the matrix completion problem.

artificial intelligence, matrix, optimization problem, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Virginia (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.34)

Add feedback

Exact learning curves for Gaussian process regression on large random graphs

Urry, Matthew, Sollich, Peter

Neural Information Processing SystemsDec-31-2010

We study learning curves for Gaussian process regression which characterise performance interms of the Bayes error averaged over datasets of a given size. Whilst learning curves are in general very difficult to calculate we show that for discrete input domains, where similarity between input points is characterised in terms of a graph, accurate predictions can be obtained. These should in fact become exact for large graphs drawn from a broad range of random graph ensembles with arbitrary degree distributions where each input (node) is connected only to a finite number of others. Our approach is based on translating the appropriate belief propagation equations to the graph ensemble. We demonstrate the accuracy of the predictions for Poisson (Erdos-Renyi) and regular random graphs, and discuss when and why previous approximations of the learning curve fail.

artificial intelligence, graph, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Asia > China (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.82)

Add feedback