AITopics | Mathematical & Statistical Methods

Mathematical & Statistical Methods

Determinantal point processes for machine learning

arXiv.org Machine Learning, Jan-10-2013

Determinantal point processes (DPPs) are elegant probabilistic models of repulsion that arise in quantum physics and random matrix theory. In contrast to traditional structured models like Markov random fields, which become intractable and hard to approximate in the presence of negative correlations, DPPs offer efficient and exact algorithms for sampling, marginalization, conditioning, and other inference tasks. We provide a gentle introduction to DPPs, focusing on the intuitions, algorithms, and extensions that are most relevant to the machine learning community, and show how DPPs can be applied to real-world applications like finding diverse sets of high-quality search results, building informative summaries by selecting diverse sentences from documents, modeling non-overlapping human poses in images or video, and automatically building timelines of important news stories.
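To make the repulsion in the abstract above concrete, here is a minimal sketch of a DPP in its L-ensemble form, where a subset Y has probability det(L_Y) / det(L + I). The kernel values, the helper names (`det`, `submatrix`, `dpp_probabilities`), and the three-item ground set are illustrative assumptions, not taken from the paper. Brute-force enumeration is used only because the example is tiny; it does not represent the efficient algorithms the abstract refers to.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion; adequate for tiny matrices only."""
    n = len(m)
    if n == 0:
        return 1.0  # determinant of the empty matrix (empty subset)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def submatrix(L, idx):
    """Rows and columns of L restricted to the indices in idx."""
    return [[L[i][j] for j in idx] for i in idx]

def dpp_probabilities(L):
    """Probability of every subset under the L-ensemble P(Y) = det(L_Y) / det(L + I)."""
    n = len(L)
    L_plus_I = [[L[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                for i in range(n)]
    Z = det(L_plus_I)
    probs = {}
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            probs[subset] = det(submatrix(L, subset)) / Z
    return probs

# Hypothetical kernel: items 0 and 1 are near-duplicates, item 2 is unrelated.
L = [[1.0, 0.9, 0.0],
     [0.9, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
probs = dpp_probabilities(L)
print(probs[(0, 1)], probs[(0, 2)])
```

With this kernel the similar pair (0, 1) receives probability of about 0.03, while the dissimilar pair (0, 2) receives about 0.16, which is the diversity preference the abstract describes.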

information retrieval, machine learning, natural language, (23 more...)

arXiv.org Machine Learning

doi: 10.1561/2200000044

1207.6083

Country:

North America > Mexico (0.46)
North America > United States > New York (0.04)
North America > United States > Arizona (0.04)
(25 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)
Research Report > Experimental Study (0.45)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine > Therapeutic Area (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
(6 more...)

Interpreting prediction markets: a stochastic approach

Frongillo, Rafael M., Penna, Nicholás Della, Reid, Mark D.

Neural Information Processing Systems, Dec-31-2012

We strengthen recent connections between prediction markets and learning by showing that a natural class of market makers can be understood as performing stochastic mirror descent when trader demands are sequentially drawn from a fixed distribution. This provides new insights into how market prices (and price paths) may be interpreted as a summary of the market's belief distribution by relating them to the optimization problem being solved. In particular, we show that under certain conditions the stationary point of the stochastic process of prices generated by the market is equal to the market's Walrasian equilibrium of classic market analysis. Together, these results suggest how traditional market making mechanisms might be replaced with general purpose learning algorithms while still retaining guarantees about their behaviour.

artificial intelligence, machine learning, prediction market, (19 more...)

Neural Information Processing Systems

Industry: Banking & Finance > Trading > Prediction Market (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets

Roux, Nicolas L., Schmidt, Mark, Bach, Francis R.

Neural Information Processing Systems, Dec-31-2012

We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth functions, where the sum is strongly convex. While standard stochastic gradient methods converge at sublinear rates for this problem, the proposed method incorporates a memory of previous gradient values in order to achieve a linear convergence rate. In a machine learning context, numerical experiments indicate that the new algorithm can dramatically outperform standard algorithms, both in terms of optimizing the training error and reducing the test error quickly.
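The idea of keeping a memory of previous gradient values can be sketched on a one-dimensional ridge regression problem. The code below is a minimal illustration under stated assumptions: the data, the function name `sag`, and the step size 1/(16L) are choices made for this example and should not be read as the authors' reference implementation.

```python
import random

def sag(a, b, lam, steps, seed=0):
    """Stochastic average gradient sketch for the scalar problem
    (1/n) * sum_i [0.5 * (a[i] * w - b[i])**2 + 0.5 * lam * w**2]."""
    rng = random.Random(seed)
    n = len(a)
    w = 0.0
    memory = [0.0] * n   # last gradient seen for each term
    grad_sum = 0.0       # running sum of the stored gradients
    lipschitz = max(ai * ai for ai in a) + lam
    step = 1.0 / (16.0 * lipschitz)  # illustrative step size (assumption)
    for _ in range(steps):
        i = rng.randrange(n)
        g = a[i] * (a[i] * w - b[i]) + lam * w  # fresh gradient of term i
        grad_sum += g - memory[i]               # swap old stored value for new
        memory[i] = g
        w -= step * grad_sum / n                # step along the averaged memory
    return w

# Hypothetical data for the example.
a = [1.0, 2.0, 0.5, 1.5, 1.0]
b = [1.0, 3.9, 1.1, 3.0, 2.1]
lam = 0.1

w_sag = sag(a, b, lam, steps=5000)
# Closed-form minimizer of the same objective, used only as a check.
w_star = sum(x * y for x, y in zip(a, b)) / (sum(x * x for x in a) + len(a) * lam)
print(w_sag, w_star)
```

Each iteration evaluates a single term's gradient, as in ordinary stochastic gradient descent, but the update direction averages the stored gradients of all terms, which is what allows a constant step size here.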

artificial intelligence, convergence rate, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe > France (0.14)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.91)

Wavelet based multi-scale shape features on arbitrary surfaces for cortical thickness discrimination

Kim, Won H., Pachauri, Deepti, Hatt, Charles, Chung, Moo K., Johnson, Sterling, Singh, Vikas

Neural Information Processing Systems, Dec-31-2012

Hypothesis testing on signals defined on surfaces (such as the cortical surface) is a fundamental component of a variety of studies in neuroscience. The goal here is to identify regions that exhibit changes as a function of the clinical condition under study. As the clinical questions of interest move towards identifying very early signs of diseases, the corresponding statistical differences at the group level invariably become weaker and increasingly hard to identify. Indeed, after a multiple comparisons correction is adopted (to account for correlated statistical tests over all surface points), very few regions may survive. In contrast to hypothesis tests on point-wise measurements, in this paper we make the case for performing statistical analysis on multi-scale shape descriptors that characterize the local topological context of the signal around each surface vertex. Our descriptors are based on recent results from harmonic analysis that show how wavelet theory extends to non-Euclidean settings (i.e., irregular weighted graphs). We provide strong evidence that these descriptors successfully pick up group-wise differences where traditional methods either fail or yield unsatisfactory results. Beyond this primary application, we show how the framework allows cortical surface smoothing to be performed in the native space without mapping to a unit sphere.

artificial intelligence, machine learning, vertex, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin > Dane County > Madison (0.15)

Genre: Research Report (0.95)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Factoring nonnegative matrices with linear programs

Recht, Ben, Re, Christopher, Tropp, Joel, Bittorf, Victor

Neural Information Processing Systems, Dec-31-2012

This paper describes a new approach for computing nonnegative matrix factorizations (NMFs) with linear programming. The key idea is a data-driven model for the factorization, in which the most salient features in the data are used to express the remaining features. More precisely, given a data matrix X, the algorithm identifies a matrix C that satisfies X = CX and some linear constraints. The matrix C selects features, which are then used to compute a low-rank NMF of X. A theoretical analysis demonstrates that this approach has the same type of guarantees as the recent NMF algorithm of Arora et al. (2012). In contrast with this earlier work, the proposed method (1) has better noise tolerance, (2) extends to more general noise models, and (3) leads to efficient, scalable algorithms. Experiments with synthetic and real datasets provide evidence that the new approach is also superior in practice. An optimized C++ implementation of the new algorithm can factor a multi-gigabyte matrix in a matter of minutes.
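The self-expression identity X = CX can be illustrated on a toy matrix. The sketch below assumes that rows of X play the role of features and that two of them generate the rest as convex combinations; the matrices, the helper `matmul`, and the way C is written down by hand are assumptions for illustration. It checks the identity only and does not solve the linear program the abstract describes.

```python
def matmul(A, B):
    """Plain-Python matrix product for small dense matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Hypothetical data: rows 0 and 1 are the salient features,
# rows 2 and 3 are convex combinations of them.
X = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 1.0],
     [0.5, 0.5, 1.5],     # 0.5 * row0 + 0.5 * row1
     [0.25, 0.75, 1.25]]  # 0.25 * row0 + 0.75 * row1

# C expresses every row of X using only the salient rows,
# so its nonzero columns mark the selected features.
C = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.5, 0.5, 0.0, 0.0],
     [0.25, 0.75, 0.0, 0.0]]

CX = matmul(C, X)

# Columns of C that are not identically zero identify the selected rows.
selected = [j for j in range(len(C)) if any(row[j] != 0.0 for row in C)]

# Low-rank nonnegative factorization X = F W built from the selection.
F = [[row[j] for j in selected] for row in C]
W = [X[j] for j in selected]
FW = matmul(F, W)
print(selected)
```

Here `selected` is `[0, 1]`, and both `CX` and `FW` reproduce X, giving a rank-2 nonnegative factorization of a 4-by-3 matrix.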

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.51)

The Lovász ϑ function, SVMs and finding large dense subgraphs

Jethava, Vinay, Martinsson, Anders, Bhattacharyya, Chiranjib, Dubhashi, Devdatt

Neural Information Processing Systems, Dec-31-2012

The Lovász $\theta$ function of a graph is a fundamental tool in combinatorial optimization and approximation algorithms. Computing $\theta$ involves solving an SDP and is extremely expensive even for moderately sized graphs. In this paper we establish that the Lovász $\theta$ function is equivalent to a kernel learning problem related to the one-class SVM. This connection opens up many opportunities for bridging graph-theoretic algorithms and machine learning. We show that there exist graphs, which we call SVM-$\theta$ graphs, on which the Lovász $\theta$ function can be approximated well by a one-class SVM. This leads to a novel use of SVM techniques to solve algorithmic problems in large graphs, e.g. identifying a planted clique of size $\Theta(\sqrt{n})$ in a random graph $G(n,\frac{1}{2})$. A classic approach to this problem involves computing the $\theta$ function; however, it is not scalable because of the SDP computation. We show that the random graph with a planted clique is an example of an SVM-$\theta$ graph, and as a consequence an SVM-based approach easily identifies the clique in large graphs and is competitive with the state of the art. Further, we introduce the notion of a "common orthogonal labelling", which extends the notion of an "orthogonal labelling" of a single graph (used in defining the $\theta$ function) to multiple graphs. The problem of finding the optimal common orthogonal labelling is cast as a Multiple Kernel Learning problem and is used to identify a large common dense region in multiple graphs. The proposed algorithm achieves an order of magnitude scalability compared to the state of the art.

artificial intelligence, graph, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe (0.47)
North America > United States (0.46)

Industry: Education > Focused Education > Special Education (0.44)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.56)

Newton-Like Methods for Sparse Inverse Covariance Estimation

Oztoprak, Figen, Nocedal, Jorge, Rennie, Steven, Olsen, Peder A.

Neural Information Processing Systems, Dec-31-2012

We propose two classes of second-order optimization methods for solving the sparse inverse covariance estimation problem. The first approach, which we call the Newton-LASSO method, minimizes a piecewise quadratic model of the objective function at every iteration to generate a step. We employ the fast iterative shrinkage thresholding method (FISTA) to solve this subproblem. The second approach, which we call the Orthant-Based Newton method, is a two-phase algorithm that first identifies an orthant face and then minimizes a smooth quadratic approximation of the objective function using the conjugate gradient method. These methods exploit the structure of the Hessian to efficiently compute the search direction and to avoid explicitly storing the Hessian. We show that quasi-Newton methods are also effective in this context, and describe a limited memory BFGS variant of the orthant-based Newton method. We present numerical results that suggest that all the techniques described in this paper have attractive properties and constitute useful tools for solving the sparse inverse covariance estimation problem. Comparisons with the method implemented in the QUIC software package are presented.
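Shrinkage-thresholding methods such as the one named in the abstract above are built around the soft-thresholding operator, the proximal map of an absolute-value penalty. The sketch below shows that operator inside a plain (non-accelerated) proximal gradient loop on a scalar toy problem. It is ISTA rather than FISTA, and the problem, the function names, and the step size are assumptions for illustration, not the paper's subproblem.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t; values within [-t, t] become exactly zero."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def ista_scalar(c, lam, step=0.5, iters=100):
    """Proximal gradient sketch for min_w 0.5 * (w - c)**2 + lam * abs(w)."""
    w = 0.0
    for _ in range(iters):
        grad = w - c                                      # gradient of the smooth part
        w = soft_threshold(w - step * grad, step * lam)   # proximal (shrinkage) step
    return w

print(ista_scalar(2.0, 0.5))  # converges to soft_threshold(2.0, 0.5) = 1.5
print(ista_scalar(0.3, 0.5))  # stays at 0.0: the penalty zeroes small values
```

For this toy objective the minimizer is known in closed form as `soft_threshold(c, lam)`, which makes the loop easy to check; the exact zeros it produces are the source of sparsity in penalized estimators of this kind.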

algorithm, artificial intelligence, optimization problem, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.75)

Spectral Estimation of Conditional Random Graph Models for Large-Scale Network Data

Freno, Antonino, Keller, Mikaela, Garriga, Gemma C., Tommasi, Marc

arXiv.org Machine Learning, Oct-16-2012

Generative models for graphs have typically been committed to strong prior assumptions concerning the form of the modeled distributions. Moreover, the vast majority of currently available models are either only suitable for characterizing some particular network properties (such as degree distribution or clustering coefficient), or they are aimed at estimating joint probability distributions, which is often intractable in large-scale networks. In this paper, we first propose a novel network statistic, based on the Laplacian spectrum of graphs, which makes it possible to dispense with any parametric assumption concerning the modeled network properties. Second, we use the defined statistic to develop the Fiedler random graph model, switching the focus from the estimation of joint probability distributions to a more tractable conditional estimation setting. After analyzing the dependence structure characterizing Fiedler random graphs, we evaluate them experimentally in edge prediction over several real-world networks, showing that they reach a much higher prediction accuracy than various alternative statistical models.

artificial intelligence, graph, machine learning, (16 more...)

arXiv.org Machine Learning

1210.486

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > France (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry:

Telecommunications > Networks (0.50)
Information Technology > Networks (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.73)

An example illustrating the imprecision of the efficient approach for diagnosis of Petri nets via integer linear programming

Grastien, Alban

arXiv.org Artificial Intelligence, Oct-15-2012

This document demonstrates that the efficient approach for diagnosis of Petri nets via integer linear programming may be unable to detect a fault even if the system is diagnosable.

artificial intelligence, optimization problem, transition, (17 more...)

arXiv.org Artificial Intelligence

1210.4231

Country: Oceania > Australia > Australian Capital Territory > Canberra (0.05)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.66)

Intelligent Computation of Reachability Sets for Space Missions

Komendera, Erik Edmund (University of Colorado - Boulder) | Scheeres, Daniel (University of Colorado - Boulder) | Bradley, Elizabeth (University of Colorado - Boulder)

AAAI Conferences, Jul-21-2012

This paper introduces a new technique for intelligently exploring the reachability set of a spacecraft: the set of trajectories from a given initial condition that are possible under a specified range of control actions. The high dimension of this problem and the nonlinear nature of gravitational interactions make the geometry of these sets complicated, hard to compute, and all but impossible to visualize. Currently, exploration of a problem’s state space is done heuristically, based on previously identified solutions. This potentially misses out on improved mission design solutions that are not close to previous approaches. The goal of the work described here is to map out reachability sets automatically. This would not only aid human mission planners, but also allow a spacecraft to determine its own course without input from Earth-based controllers. Brute-force approaches to this are computationally prohibitive, so one must focus the effort on regions that are of interest: where neighboring trajectories diverge quickly, for instance, or come close to a body that the spacecraft is orbiting. In this paper, we focus on the first of those two criteria; the goal is to identify regions in the system’s state space where small changes have large effects (or vice versa) and concentrate the computational mesh accordingly.

mesh, reachability, trajectory, (17 more...)

AAAI Conferences

Twenty-Fourth IAAI Conference

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Colorado (0.04)
North America > United States > New York (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.94)
