
Collaborating Authors

 Banerjee, Arindam


Estimation with Norm Regularization

arXiv.org Machine Learning

Analysis of non-asymptotic estimation error and structured statistical recovery based on norm-regularized regression, such as the Lasso, needs to consider four aspects: the norm, the loss function, the design matrix, and the noise model. This paper presents generalizations of such estimation error analysis on all four aspects compared to the existing literature. We characterize the restricted error set where the estimation error vector lies, establish relations between error sets for the constrained and regularized problems, and present an estimation error bound applicable to any norm. Precise characterizations of the bound are presented for isotropic as well as anisotropic sub-Gaussian design matrices, sub-Gaussian noise models, and convex loss functions, including least squares and generalized linear models. Generic chaining and associated results play an important role in the analysis. A key result from the analysis is that the sample complexity of all such estimators depends on the Gaussian width of a spherical cap corresponding to the restricted error set. Further, once the number of samples $n$ crosses the required sample complexity, the estimation error decreases as $\frac{c}{\sqrt{n}}$, where $c$ depends on the Gaussian width of the unit norm ball.
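
As a hedged illustration of the setup summarized above (the notation below is assumed for exposition, not taken verbatim from the paper), the regularized estimator and the resulting error behavior can be written as

    \hat{\theta}_{\lambda_n} \in \arg\min_{\theta \in \mathbb{R}^p} \; \mathcal{L}(\theta; Z_1, \ldots, Z_n) + \lambda_n R(\theta),
    \qquad
    \|\hat{\theta}_{\lambda_n} - \theta^*\|_2 \le \frac{c}{\sqrt{n}} \quad \text{once } n \gtrsim w(\mathcal{A})^2,

where $R(\cdot)$ is any norm, $\mathcal{A}$ is the spherical cap obtained by intersecting the cone of the restricted error set with the unit sphere, and $w(\mathcal{A})$ denotes its Gaussian width.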


Generalized Dantzig Selector: Application to the k-support norm

arXiv.org Machine Learning

We propose a Generalized Dantzig Selector (GDS) for linear models, in which any norm encoding the parameter structure can be leveraged for estimation. We investigate both computational and statistical aspects of the GDS. Based on the conjugate proximal operator, a flexible inexact ADMM framework is designed for solving the GDS, and non-asymptotic high-probability bounds are established on the estimation error, which rely on the Gaussian widths of the unit norm ball and of a suitable set encompassing the estimation error. Further, we consider a non-trivial example of the GDS using the $k$-support norm. We derive an efficient method to compute the proximal operator for the $k$-support norm since existing methods are inapplicable in this setting. For statistical analysis, we provide upper bounds for the Gaussian widths needed in the GDS analysis, yielding the first statistical recovery guarantee for estimation with the $k$-support norm. The experimental results confirm our theoretical analysis.
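
For intuition, a minimal sketch of the GDS is given below for the special case $R(\cdot) = \|\cdot\|_1$ (whose dual norm is $\|\cdot\|_\infty$), solved with the generic convex solver cvxpy rather than the paper's inexact ADMM framework; the function and variable names are illustrative.

    # Generalized Dantzig Selector, sketched for the l1 norm:
    #   min ||theta||_1   s.t.   ||X^T (y - X theta)||_inf <= lam
    import numpy as np
    import cvxpy as cp

    def gds_l1(X, y, lam):
        """Hedged sketch: generic convex solver, not the paper's inexact ADMM."""
        _, p = X.shape
        theta = cp.Variable(p)
        constraints = [cp.norm_inf(X.T @ (y - X @ theta)) <= lam]
        cp.Problem(cp.Minimize(cp.norm1(theta)), constraints).solve()
        return theta.value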


Parallel Direction Method of Multipliers

Neural Information Processing Systems

We consider the problem of minimizing block-separable convex functions subject to linear constraints. While the Alternating Direction Method of Multipliers (ADMM) for two-block linear constraints has been intensively studied both theoretically and empirically, in spite of some preliminary work, effective generalizations of ADMM to multiple blocks are still unclear. In this paper, we propose a parallel randomized block coordinate method named the Parallel Direction Method of Multipliers (PDMM) to solve optimization problems with multi-block linear constraints. PDMM randomly updates some blocks in parallel, behaving like parallel randomized block coordinate descent. We establish the global convergence and the iteration complexity for PDMM with constant step size. We also show that PDMM can do randomized block coordinate descent on overlapping blocks. Experimental results show that PDMM performs better than state-of-the-art methods in two applications, robust principal component analysis and overlapping group lasso.
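
The structure of the randomized parallel block updates can be sketched as follows; this schematic conveys the shape of the method only and does not reproduce the paper's exact update rules, backward dual steps, or step-size conditions, and all names are illustrative.

    # Schematic randomized parallel block updates for
    #   min_x  sum_j f_j(x_j)   s.t.   sum_j A_j x_j = c
    import numpy as np

    def pdmm_like(f_grads, A_blocks, c, x0, rho=1.0, eta=0.1, K=2, iters=200, seed=0):
        """Hedged sketch of a PDMM-style scheme, not the paper's algorithm."""
        rng = np.random.default_rng(seed)
        x = [xj.copy() for xj in x0]
        y = np.zeros_like(c)                    # dual variable for the coupling constraint
        for _ in range(iters):
            residual = sum(A @ xj for A, xj in zip(A_blocks, x)) - c
            picked = rng.choice(len(x), size=K, replace=False)  # random subset of blocks
            for j in picked:                    # independent updates -> parallelizable
                grad_aug = f_grads[j](x[j]) + A_blocks[j].T @ (y + rho * residual)
                x[j] = x[j] - eta * grad_aug    # gradient step on the augmented Lagrangian
            residual = sum(A @ xj for A, xj in zip(A_blocks, x)) - c
            y = y + rho * residual              # dual ascent step
        return x, y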


Estimation with Norm Regularization

Neural Information Processing Systems

Analysis of estimation error and associated structured statistical recovery based on norm-regularized regression, e.g., the Lasso, needs to consider four aspects: the norm, the loss function, the design matrix, and the noise vector. This paper presents generalizations of such estimation error analysis on all four aspects, compared to the existing literature. We characterize the restricted error set, establish relations between error sets for the constrained and regularized problems, and present an estimation error bound applicable to {\em any} norm. Precise characterizations of the bound are presented for a variety of noise vectors and design matrices, including sub-Gaussian, anisotropic, and dependent samples, and for loss functions, including least squares and generalized linear models. Gaussian widths, as a measure of the size of suitable sets, and associated tools play a key role in our generalized analysis.


Generalized Dantzig Selector: Application to the k-support norm

Neural Information Processing Systems

We propose a Generalized Dantzig Selector (GDS) for linear models, in which any norm encoding the parameter structure can be leveraged for estimation. We investigate both computational and statistical aspects of the GDS. Based on the conjugate proximal operator, a flexible inexact ADMM framework is designed for solving the GDS. Thereafter, non-asymptotic high-probability bounds are established on the estimation error, which rely on the Gaussian widths of the unit norm ball and the error set. Further, we consider a non-trivial example of the GDS using the k-support norm. We derive an efficient method to compute the proximal operator for the k-support norm since existing methods are inapplicable in this setting. For statistical analysis, we provide upper bounds for the Gaussian widths needed in the GDS analysis, yielding the first statistical recovery guarantee for estimation with the k-support norm. The experimental results confirm our theoretical analysis.


Bregman Alternating Direction Method of Multipliers

Neural Information Processing Systems

The mirror descent algorithm (MDA) generalizes gradient descent by using a Bregman divergence to replace the squared Euclidean distance. In this paper, we similarly generalize the alternating direction method of multipliers (ADMM) to Bregman ADMM (BADMM), which allows the choice of different Bregman divergences to exploit the structure of problems. BADMM provides a unified framework for ADMM and its variants, including generalized ADMM, inexact ADMM, and Bethe ADMM. We establish the global convergence and the $O(1/T)$ iteration complexity for BADMM. In some cases, BADMM can be faster than ADMM by a factor of $O(n/\ln n)$, where $n$ is the dimensionality. In solving the linear program of the mass transportation problem, BADMM leads to massive parallelism and can easily run on a GPU, where it is several times faster than the highly optimized commercial solver Gurobi.
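
To make the generalization concrete, consider ADMM for $\min_{x,z} f(x) + g(z)$ subject to $Ax + Bz = c$. A hedged sketch of how BADMM modifies the $x$-update is shown below; the placement and arguments of the divergence follow one common presentation and may differ in detail from the paper.

    x^{k+1} = \arg\min_{x} \; f(x) + \langle y^k, Ax + Bz^k - c \rangle + \tfrac{\rho}{2}\|Ax + Bz^k - c\|_2^2
    \quad\longrightarrow\quad
    x^{k+1} = \arg\min_{x} \; f(x) + \langle y^k, Ax + Bz^k - c \rangle + \rho\, B_\phi(c - Bz^k, Ax),

where $B_\phi$ is the Bregman divergence induced by a convex function $\phi$; choosing $\phi = \tfrac{1}{2}\|\cdot\|_2^2$ recovers standard ADMM.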


Multi-task Sparse Structure Learning

arXiv.org Machine Learning

Multi-task learning (MTL) aims to improve generalization performance by learning multiple related tasks simultaneously. While sometimes the underlying task relationship structure is known, often the structure needs to be estimated from the data at hand. In this paper, we present a novel family of models for MTL, applicable to regression and classification problems, capable of learning the structure of task relationships. In particular, we consider a joint estimation problem of the task relationship structure and the individual task parameters, which is solved using alternating minimization. The task relationship structure learning component builds on recent advances in structure learning of Gaussian graphical models based on sparse estimators of the precision (inverse covariance) matrix. We illustrate the effectiveness of the proposed model on a variety of synthetic and benchmark datasets for regression and classification. We also consider the problem of combining climate model outputs for better projections of future climate, with a focus on temperature in South America, and show that the proposed model outperforms several existing methods for the problem.
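
A hedged sketch of the alternating minimization described above (not the paper's exact algorithm, objective, or notation) alternates gradient steps on the task weights with a graphical-lasso style re-estimation of the task precision matrix:

    # Alternating minimization sketch: task weights W (features x tasks) vs.
    # task precision Omega (tasks x tasks); all names and step sizes are illustrative.
    import numpy as np
    from sklearn.covariance import GraphicalLasso

    def mssl_like(Xs, ys, lam=0.1, gl_alpha=0.05, outer=10, inner=50, lr=1e-3):
        k, d = len(Xs), Xs[0].shape[1]
        W = np.zeros((d, k))
        Omega = np.eye(k)
        for _ in range(outer):
            for _ in range(inner):            # W-step: squared loss + lam * tr(W Omega W^T)
                G = np.zeros_like(W)
                for t in range(k):
                    G[:, t] = Xs[t].T @ (Xs[t] @ W[:, t] - ys[t])
                W -= lr * (G + lam * W @ (Omega + Omega.T))
            # Omega-step: sparse precision of the task weight vectors
            # (rows of W treated as observations of a k-dimensional variable)
            Omega = GraphicalLasso(alpha=gl_alpha).fit(W).precision_
        return W, Omega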


Online Portfolio Selection with Group Sparsity

AAAI Conferences

In portfolio selection, it might often be preferable to focus on a few top-performing industries/sectors to beat the market. These top-performing sectors, however, might change over time. In this paper, we propose an online portfolio selection algorithm that can take advantage of sector information through the use of a group-sparsity-inducing regularizer while making lazy updates to the portfolio. The lazy updates prevent changing one's portfolio too often, which otherwise might incur huge transaction costs. The proposed formulation leads to a non-smooth constrained optimization problem at every step, with the constraint that the solution has to lie in a probability simplex. We propose an efficient primal-dual based alternating direction method of multipliers algorithm and demonstrate its effectiveness for the problem of online portfolio selection with sector information. We show that our algorithm OLU-GS has sub-linear regret w.r.t. the best fixed and best shifting solution in hindsight. We successfully establish the robustness and scalability of OLU-GS by performing extensive experiments on two real-world datasets.
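
Every round of the formulation above must return a portfolio on the probability simplex; as a hedged illustration of that building block (not of OLU-GS itself), a standard Euclidean projection onto the simplex is sketched below.

    import numpy as np

    def project_simplex(v):
        """Project v onto {x : x >= 0, sum(x) = 1} in the Euclidean norm."""
        u = np.sort(v)[::-1]                    # sort in descending order
        css = np.cumsum(u)
        rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
        theta = (css[rho] - 1.0) / (rho + 1.0)
        return np.maximum(v - theta, 0.0)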


Bregman Alternating Direction Method of Multipliers

arXiv.org Machine Learning

The mirror descent algorithm (MDA) generalizes gradient descent by using a Bregman divergence to replace the squared Euclidean distance. In this paper, we similarly generalize the alternating direction method of multipliers (ADMM) to Bregman ADMM (BADMM), which allows the choice of different Bregman divergences to exploit the structure of problems. BADMM provides a unified framework for ADMM and its variants, including generalized ADMM, inexact ADMM, and Bethe ADMM. We establish the global convergence and the $O(1/T)$ iteration complexity for BADMM. In some cases, BADMM can be faster than ADMM by a factor of $O(n/\log(n))$, where $n$ is the dimensionality. In solving the linear program of the mass transportation problem, BADMM leads to massive parallelism and can easily run on a GPU, where it is several times faster than the highly optimized commercial solver Gurobi.


Large Scale Distributed Sparse Precision Estimation

Neural Information Processing Systems

We consider the problem of sparse precision matrix estimation in high dimensions using the CLIME estimator, which has several desirable theoretical properties. We present an inexact alternating direction method of multipliers (ADMM) algorithm for CLIME, and establish rates of convergence for both the objective and optimality conditions. Further, we develop a large-scale distributed framework for the computations, which scales to millions of dimensions and trillions of parameters, using hundreds of cores. The proposed framework solves CLIME in column-blocks and only involves elementwise operations and parallel matrix multiplications. We evaluate our algorithm on both shared-memory and distributed-memory architectures, which can use block cyclic distribution of data and parameters to achieve load balance and improve the efficiency in the use of memory hierarchies. Experimental results show that our algorithm is substantially more scalable than state-of-the-art methods and scales almost linearly with the number of cores.
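
As a hedged, small-scale illustration of the column structure referenced above (not the paper's inexact ADMM or its distributed column-block implementation), each column of the CLIME estimate solves an $\ell_1$ problem with an $\ell_\infty$ constraint; the code below uses a generic convex solver and illustrative names.

    import numpy as np
    import cvxpy as cp

    def clime_columns(S, lam):
        """S: empirical covariance (p x p); returns a column-wise (possibly asymmetric) estimate."""
        p = S.shape[0]
        Omega = np.zeros((p, p))
        for j in range(p):
            e_j = np.zeros(p)
            e_j[j] = 1.0
            w = cp.Variable(p)
            cp.Problem(cp.Minimize(cp.norm1(w)),
                       [cp.norm_inf(S @ w - e_j) <= lam]).solve()
            Omega[:, j] = w.value
        return Omega

The column-wise estimates are typically symmetrized afterwards, e.g., by keeping the smaller-magnitude entry of each symmetric pair.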