AITopics

1309.6301

Country: Europe (0.46)

Genre: Research Report (0.64)

Industry:

Media > Film (0.55)
Leisure & Entertainment (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Nagano, Kiyohito, Kawahara, Yoshinobu

Structured Convex Optimization under Submodular Constraints

A number of discrete and continuous optimization problems in machine learning are related to convex minimization problems under submodular constraints. In this paper, we deal with a submodular function with a directed graph structure, and we show that a wide range of convex optimization problems under submodular constraints can be solved much more efficiently than general submodular optimization methods by a reduction to a maximum flow problem. Furthermore, we give some applications, including sparse optimization methods, in which the proposed methods are effective. Additionally, we evaluate the performance of the proposed method through computational experiments.

algorithm, artificial intelligence, optimization problem, (14 more...)

1309.685

Country: Asia > Japan > Honshū (0.69)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Learning Max-Margin Tree Predictors

Meshi, Ofer, Eban, Elad, Elidan, Gal, Globerson, Amir

Structured prediction is a powerful framework for coping with joint prediction of interacting outputs. A central difficulty in using this framework is that often the correct label dependence structure is unknown. At the same time, we would like to avoid an overly complex structure that will lead to intractable prediction. In this work we address the challenge of learning tree structured predictive models that achieve high accuracy while at the same time facilitate efficient (linear time) inference. We start by proving that this task is in general NP-hard, and then suggest an approximate alternative. Briefly, our CRANK approach relies on a novel Circuit-RANK regularizer that penalizes non-tree structures and that can be optimized using a CCCP procedure. We demonstrate the effectiveness of our approach on several domains and show that, despite the relative simplicity of the structure, prediction accuracy is competitive with a fully connected model that is computationally costly at prediction time.

artificial intelligence, inductive learning, machine learning, (20 more...)

1309.6847

Country:

North America > United States (0.28)
Asia > Middle East (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(2 more...)

Honorio, Jean, Jaakkola, Tommi S.

Inverse Covariance Estimation for High-Dimensional Data in Linear Time and Space: Spectral Methods for Riccati and Sparse Models

We propose maximum likelihood estimation for learning Gaussian graphical models with a Gaussian (ell_2^2) prior on the parameters. This is in contrast to the commonly used Laplace (ell_1) prior for encouraging sparseness. We show that our optimization problem leads to a Riccati matrix equation, which has a closed form solution. We propose an efficient algorithm that performs a singular value decomposition of the training data. Our algorithm is O(NT^2)-time and O(NT)-space for N variables and T samples. Our method is tailored to high-dimensional problems (N gg T), in which sparseness promoting methods become intractable. Furthermore, instead of obtaining a single solution for a specific regularization parameter, our algorithm finds the whole solution path. We show that the method has logarithmic sample complexity under the spiked covariance model. We also propose sparsification of the dense solution with provable performance guarantees. We provide techniques for using our learnt models, such as removing unimportant variables, computing likelihoods and conditional distributions. Finally, we show promising results in several gene expressions datasets.

artificial intelligence, bayesian inference, machine learning, (13 more...)

1309.6838

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.55)

Cheng, Hao, Zhang, Xinhua, Schuurmans, Dale

Convex Relaxations of Bregman Divergence Clustering

Although many convex relaxations of clustering have been proposed in the past decade, current formulations remain restricted to spherical Gaussian or discriminative models and are susceptible to imbalanced clusters. To address these shortcomings, we propose a new class of convex relaxations that can be flexibly applied to more general forms of Bregman divergence clustering. By basing these new formulations on normalized equivalence relations we retain additional control on relaxation quality, which allows improvement in clustering quality. We furthermore develop optimization methods that improve scalability by exploiting recent implicit matrix norm methods. In practice, we find that the new formulations are able to efficiently produce tighter clusterings that improve the accuracy of state of the art methods.

artificial intelligence, machine learning, relaxation, (18 more...)

1309.6823

Country: North America (0.46)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)

Cai, T. Tony, Zhou, Wen-Xin

A Max-Norm Constrained Minimization Approach to 1-Bit Matrix Completion

arXiv.org Machine LearningSep-23-2013

Matrix completion, which aims to recover a low-rank matrix from a subset of its entries, has been an active area of research in the last few years. It has a range of successful applications. In some real-life situations, however, the observations are highly quantized, sometimes even to a single bit and thus the standard matrix completion techniques do not apply. Take the Netflix problem as an example, the observations are the ratings of movies, which are quantized to the set of integers from 1 to 5. In the more extreme case such as recommender systems, only a single bit of rating standing for a "thumbs up" or "thumbs down" is recorded at each occurrence. Another example of applications is targeted advertising, such as the relevance of advertisements on Hulu.

artificial intelligence, machine learning, optimization problem, (18 more...)

1309.6013

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report (1.00)

Industry:

Media > Television (0.68)
Media > Film (0.54)
Information Technology > Services (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Fercoq, Olivier, Richtárik, Peter

Smooth minimization of nonsmooth functions with parallel coordinate descent methods

arXiv.org Machine LearningSep-23-2013

We study the performance of a family of randomized parallel coordinate descent methods for minimizing the sum of a nonsmooth and separable convex functions. The problem class includes as a special case L1-regularized L1 regression and the minimization of the exponential loss ("AdaBoost problem"). We assume the input data defining the loss function is contained in a sparse $m\times n$ matrix $A$ with at most $\omega$ nonzeros in each row. Our methods need $O(n \beta/\tau)$ iterations to find an approximate solution with high probability, where $\tau$ is the number of processors and $\beta = 1 + (\omega-1)(\tau-1)/(n-1)$ for the fastest variant. The notation hides dependence on quantities such as the required accuracy and confidence levels and the distance of the starting iterate from an optimal point. Since $\beta/\tau$ is a decreasing function of $\tau$, the method needs fewer iterations when more processors are used. Certain variants of our algorithms perform on average only $O(\nnz(A)/n)$ arithmetic operations during a single iteration per processor and, because $\beta$ decreases when $\omega$ does, fewer iterations are needed for sparser problems.

artificial intelligence, descent method, machine learning, (19 more...)

1309.5885

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Mousavi, Ali, Maleki, Arian, Baraniuk, Richard G.

Asymptotic Analysis of LASSOs Solution Path with Implications for Approximate Message Passing

arXiv.org Machine LearningSep-23-2013

This paper concerns the performance of the LASSO (also knows as basis pursuit denoising) for recovering sparse signals from undersampled, randomized, noisy measurements. We consider the recovery of the signal $x_o \in \mathbb{R}^N$ from $n$ random and noisy linear observations $y= Ax_o + w$, where $A$ is the measurement matrix and $w$ is the noise. The LASSO estimate is given by the solution to the optimization problem $x_o$ with $\hat{x}_{\lambda} = \arg \min_x \frac{1}{2} \|y-Ax\|_2^2 + \lambda \|x\|_1$. Despite major progress in the theoretical analysis of the LASSO solution, little is known about its behavior as a function of the regularization parameter $\lambda$. In this paper we study two questions in the asymptotic setting (i.e., where $N \rightarrow \infty$, $n \rightarrow \infty$ while the ratio $n/N$ converges to a fixed number in $(0,1)$): (i) How does the size of the active set $\|\hat{x}_\lambda\|_0/N$ behave as a function of $\lambda$, and (ii) How does the mean square error $\|\hat{x}_{\lambda} - x_o\|_2^2/N$ behave as a function of $\lambda$? We then employ these results in a new, reliable algorithm for solving LASSO based on approximate message passing (AMP).

artificial intelligence, lasso, optimization problem, (16 more...)

1309.5979

Genre: Research Report (0.64)

Technology:

Information Technology > Architecture > Distributed Systems (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Ghadimi, Saeed, Lan, Guanghui

Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming

arXiv.org Machine LearningSep-21-2013

In this paper, we introduce a new stochastic approximation (SA) type algorithm, namely the randomized stochastic gradient (RSG) method, for solving an important class of nonlinear (possibly nonconvex) stochastic programming (SP) problems. We establish the complexity of this method for computing an approximate stationary point of a nonlinear programming problem. We also show that this method possesses a nearly optimal rate of convergence if the problem is convex. We discuss a variant of the algorithm which consists of applying a post-optimization phase to evaluate a short list of solutions generated by several independent runs of the RSG method, and show that such modification allows to improve significantly the large-deviation properties of the algorithm. These methods are then specialized for solving a class of simulation-based optimization problems in which only stochastic zeroth-order information is available.

artificial intelligence, machine learning, optimization problem, (17 more...)

1309.5549

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Flórez, Edson, Gómez, Wilfredo, Bautista, Lola

An ant colony optimization algorithm for job shop scheduling problem

arXiv.org Artificial IntelligenceSep-19-2013

The nature has inspired several metaheuristics, outstanding among these is Ant Colony Optimization (ACO), which have proved to be very effective and efficient in problems of high complexity (NP-hard) in combinatorial optimization. This paper describes the implementation of an ACO model algorithm known as Elitist Ant System (EAS), applied to a combinatorial optimization problem called Job Shop Scheduling Problem (JSSP). We propose a method that seeks to reduce delays designating the operation immediately available, but considering the operations that lack little to be available and have a greater amount of pheromone. The performance of the algorithm was evaluated for problems of JSSP reference, comparing the quality of the solutions obtained regarding the best known solution of the most effective methods. The solutions were of good quality and obtained with a remarkable efficiency by having to make a very low number of objective function evaluations.

artificial intelligence, evolutionary algorithm, machine learning, (13 more...)

arXiv.org Artificial Intelligence

1309.511

Country: North America > United States (0.68)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)