AITopics

1203.3051

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

arXiv.org Machine LearningMar-14-2012

A Proximal-Gradient Homotopy Method for the Sparse Least-Squares Problem

Xiao, Lin, Zhang, Tong

We consider solving the $\ell_1$-regularized least-squares ($\ell_1$-LS) problem in the context of sparse recovery, for applications such as compressed sensing. The standard proximal gradient method, also known as iterative soft-thresholding when applied to this problem, has low computational cost per iteration but a rather slow convergence rate. Nevertheless, when the solution is sparse, it often exhibits fast linear convergence in the final stage. We exploit the local linear convergence using a homotopy continuation strategy, i.e., we solve the $\ell_1$-LS problem for a sequence of decreasing values of the regularization parameter, and use an approximate solution at the end of each stage to warm start the next stage. Although similar strategies have been studied in the literature, there have been no theoretical analysis of their global iteration complexity. This paper shows that under suitable assumptions for sparse recovery, the proposed homotopy strategy ensures that all iterates along the homotopy solution path are sparse. Therefore the objective function is effectively strongly convex along the solution path, and geometric convergence at each stage can be established. As a result, the overall iteration complexity of our method is $O(\log(1/\epsilon))$ for finding an $\epsilon$-optimal solution, which can be interpreted as global geometric rate of convergence. We also present empirical results to support our theoretical analysis.

artificial intelligence, convergence, machine learning, (19 more...)

1203.3002

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)

arXiv.org Artificial IntelligenceMar-11-2012

Scaling Inference for Markov Logic with a Task-Decomposition Approach

Niu, Feng, Zhang, Ce, Ré, Christopher, Shavlik, Jude

Motivated by applications in large-scale knowledge base construction, we study the problem of scaling up a sophisticated statistical inference framework called Markov Logic Networks (MLNs). Our approach, Felix, uses the idea of Lagrangian relaxation from mathematical programming to decompose a program into smaller tasks while preserving the joint-inference property of the original MLN. The advantage is that we can use highly scalable specialized algorithms for common tasks such as classification and coreference. We propose an architecture to support Lagrangian relaxation in an RDBMS which we show enables scalable joint inference for MLNs. We empirically validate that Felix is significantly more scalable and efficient than prior approaches to MLN inference by constructing a knowledge base from 1.8M documents as part of the TAC challenge. We show that Felix scales and achieves state-of-the-art quality numbers. In contrast, prior approaches do not scale even to a subset of the corpus that is three orders of magnitude smaller.

artificial intelligence, machine learning, natural language, (22 more...)

1108.0294

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
(3 more...)

de Freitas, Nando, Smola, Alex, Zoghi, Masrour

Regret Bounds for Deterministic Gaussian Process Bandits

arXiv.org Machine LearningMar-9-2012

This paper analyses the problem of Gaussian process (GP) bandits with deterministic observations. The analysis uses a branch and bound algorithm that is related to the UCB algorithm of (Srinivas et al., 2010). For GPs with Gaussian observation noise, with variance strictly greater than zero, (Srinivas et al., 2010) proved that the regret vanishes at the approximate rate of $O(\frac{1}{\sqrt{t}})$, where t is the number of observations. To complement their result, we attack the deterministic case and attain a much faster exponential convergence rate. Under some regularity assumptions, we show that the regret decreases asymptotically according to $O(e^{-\frac{\tau t}{(\ln t)^{d/4}}})$ with high probability. Here, d is the dimension of the search space and $\tau$ is a constant that depends on the behaviour of the objective function near its global maximum.

algorithm, artificial intelligence, machine learning, (18 more...)

1203.2177

Country: North America > Canada > Alberta (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)

Wahlberg, Bo, Boyd, Stephen, Annergren, Mariette, Wang, Yang

An ADMM Algorithm for a Class of Total Variation Regularized Estimation Problems

arXiv.org Machine LearningMar-8-2012

We present an alternating augmented Lagrangian method for convex optimization problems where the cost function is the sum of two terms, one that is separable in the variable blocks, and a second that is separable in the difference between consecutive variable blocks. Examples of such problems include Fused Lasso estimation, total variation denoising, and multi-period portfolio optimization with transaction costs. In each iteration of our method, the first step involves separately optimizing over each variable block, which can be carried out in parallel. The second step is not separable in the variables, but can be carried out very efficiently. We apply the algorithm to segmentation of data based on changes inmean (l_1 mean filtering) or changes in variance (l_1 variance filtering). In a numerical example, we show that our implementation is around 10000 times faster compared with the generic optimization solver SDPT3.

algorithm, artificial intelligence, machine learning, (17 more...)

1203.1828

Country: Europe > Sweden (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)

Agarwal, Alekh, Negahban, Sahand N., Wainwright, Martin J.

Noisy matrix decomposition via convex relaxation: Optimal rates in high dimensions

arXiv.org Machine LearningMar-6-2012

We analyze a class of estimators based on convex relaxation for solving high-dimensional matrix decomposition problems. The observations are noisy realizations of a linear transformation $\mathfrak{X}$ of the sum of an approximately) low rank matrix $\Theta^\star$ with a second matrix $\Gamma^\star$ endowed with a complementary form of low-dimensional structure; this set-up includes many statistical models of interest, including factor analysis, multi-task regression, and robust covariance estimation. We derive a general theorem that bounds the Frobenius norm error for an estimate of the pair $(\Theta^\star, \Gamma^\star)$ obtained by solving a convex optimization problem that combines the nuclear norm with a general decomposable regularizer. Our results utilize a "spikiness" condition that is related to but milder than singular vector incoherence. We specialize our general result to two cases that have been studied in past work: low rank plus an entrywise sparse matrix, and low rank plus a columnwise sparse matrix. For both models, our theory yields non-asymptotic Frobenius error bounds for both deterministic and stochastic noise matrices, and applies to matrices $\Theta^\star$ that can be exactly or approximately low rank, and matrices $\Gamma^\star$ that can be exactly or approximately sparse. Moreover, for the case of stochastic noise matrices and the identity observation operator, we establish matching lower bounds on the minimax error. The sharpness of our predictions is confirmed by numerical simulations.

artificial intelligence, machine learning, optimization problem, (18 more...)

doi: 10.1214/12-AOS1000

1102.4807

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

AAAI ConferencesFeb-22-2012

Learning the Nature of Information in Social Networks

Agrawal, Rakesh (Microsoft) | Potamias, Michalis (Groupon) | Terzi, Evimaria (Boston University)

We postulate that the nature of information items plays a vital role in the observed spread of these items in a social network. We capture this intuition by proposing a model that assigns to every information item two parameters: endogeneity and exogeneity. The endogeneity of the item quantifies its tendency to spread primarily through the connections between nodes; the exogeneity quantifies its tendency to be acquired by the nodes, independently of the underlying network. We also extend this item-based model to take into account the openness of each node to new information. We quantify openness by introducing the receptivity of a node. Given a social network and data related to the ordering of adoption of information items by nodes, we develop a maximum-likelihood framework for estimating endogeneity, exogeneity and receptivity parameters. We apply our methodology to synthetic and real data and demonstrate its efficacy as a data-analytic tool.

exogeneity, information item, node, (16 more...)

AAAI Conferences

Sixth International AAAI Conference on Weblogs and Social Media

Country: North America > United States (0.14)

Industry:

Information Technology > Services (0.82)
Government (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Recht, Benjamin, Re, Christopher

Beneath the valley of the noncommutative arithmetic-geometric mean inequality: conjectures, case-studies, and consequences

arXiv.org Machine LearningFeb-19-2012

Randomized algorithms that base iteration-level decisions on samples from some pool are ubiquitous in machine learning and optimization. Examples include stochastic gradient descent and randomized coordinate descent. This paper makes progress at theoretically evaluating the difference in performance between sampling with- and without-replacement in such algorithms. Focusing on least means squares optimization, we formulate a noncommutative arithmetic-geometric mean inequality that would prove that the expected convergence rate of without-replacement sampling is faster than that of with-replacement sampling. We demonstrate that this inequality holds for many classes of random matrices and for some pathological examples as well. We provide a deterministic worst-case bound on the gap between the discrepancy between the two sampling models, and explore some of the impediments to proving this inequality in full generality. We detail the consequences of this inequality for stochastic gradient descent and the randomized Kaczmarz algorithm for solving linear systems.

artificial intelligence, inequality, machine learning, (15 more...)

1202.4184

Country: North America > United States > Wisconsin (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Kumar, Akshat, Zilberstein, Shlomo

Message-Passing Algorithms for Quadratic Programming Formulations of MAP Estimation

arXiv.org Artificial IntelligenceFeb-14-2012

Computing maximum a posteriori (MAP) estimation in graphical models is an important inference problem with many applications. We present message-passing algorithms for quadratic programming (QP) formulations of MAP estimation for pairwise Markov random fields. In particular, we use the concave-convex procedure (CCCP) to obtain a locally optimal algorithm for the non-convex QP formulation. A similar technique is used to derive a globally convergent algorithm for the convex QP relaxation of MAP. We also show that a recently developed expectation-maximization (EM) algorithm for the QP formulation of MAP can be derived from the CCCP perspective. Experiments on synthetic and real-world problems confirm that our new approach is competitive with max-product and its variations. Compared with CPLEX, we achieve more than an order-of-magnitude speedup in solving optimally the convex QP relaxation.

artificial intelligence, constraint, machine learning, (19 more...)

1202.3739

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Sanner, Scott, Delgado, Karina Valdivia, de Barros, Leliane Nunes

Symbolic Dynamic Programming for Discrete and Continuous State MDPs

arXiv.org Artificial IntelligenceFeb-14-2012

Many real-world decision-theoretic planning problems can be naturally modeled with discrete and continuous state Markov decision processes (DC-MDPs). While previous work has addressed automated decision-theoretic planning for DCMDPs, optimal solutions have only been defined so far for limited settings, e.g., DC-MDPs having hyper-rectangular piecewise linear value functions. In this work, we extend symbolic dynamic programming (SDP) techniques to provide optimal solutions for a vastly expanded class of DCMDPs. To address the inherent combinatorial aspects of SDP, we introduce the XADD - a continuous variable extension of the algebraic decision diagram (ADD) - that maintains compact representations of the exact value function. Empirically, we demonstrate an implementation of SDP with XADDs on various DC-MDPs, showing the first optimal automated solutions to DCMDPs with linear and nonlinear piecewise partitioned value functions and showing the advantages of constraint-based pruning for XADDs.

artificial intelligence, bayesian inference, machine learning, (20 more...)

1202.3762

Country:

South America > Brazil > São Paulo (0.05)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)