Banerjee, Arindam
High-Dimensional Dependency Structure Learning for Physical Processes
Golmohammadi, Jamal, Ebert-Uphoff, Imme, He, Sijie, Deng, Yi, Banerjee, Arindam
In this paper, we consider the use of structure learning methods for probabilistic graphical models to identify statistical dependencies in high-dimensional physical processes. Such processes are often synthetically characterized using partial differential equations (PDEs) and are observed in a variety of natural phenomena, including geoscience data capturing atmospheric and hydrological phenomena. Classical structure learning approaches such as the PC algorithm and its variants are challenging to apply due to their high computational and sample requirements. Modern approaches, often based on sparse regression and its variants, do come with finite-sample guarantees, but are usually highly sensitive to the choice of hyper-parameters, e.g., the parameter $\lambda$ for the sparsity-inducing constraint or regularization. We present ACLIME-ADMM, an efficient two-step algorithm for adaptive structure learning, which estimates an edge-specific parameter $\lambda_{ij}$ in the first step and uses these parameters to learn the structure in the second step. Both steps of our algorithm use (inexact) ADMM to solve suitable linear programs, and all iterations can be done in closed form in an efficient block-parallel manner. We compare ACLIME-ADMM with baselines on both synthetic data simulated from PDEs that model advection-diffusion processes, and real data (50 years) of daily global geopotential heights, used to study information flow in the atmosphere. ACLIME-ADMM is shown to be efficient, stable, and competitive, usually better than the baselines, especially on difficult problems. On real data, ACLIME-ADMM recovers the underlying structure of global atmospheric circulation entirely from the data, including switches in wind directions at the equator and in the tropics.
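The core subproblem in both steps is a CLIME-style linear program per column of the precision matrix. As a rough illustration only, not the paper's solver, the sketch below hands one such column LP to a generic LP solver; ACLIME-ADMM instead solves these programs with inexact ADMM, whose iterations are closed-form and block-parallel, and uses edge-specific parameters $\lambda_{ij}$ rather than a single $\lambda$.

```python
# Minimal sketch (assumed setup, not the paper's ADMM solver): solve the
# column-wise CLIME-style LP  min ||b||_1  s.t.  ||S b - e_i||_inf <= lam
# with a generic LP solver, for a single lambda.
import numpy as np
from scipy.optimize import linprog

def clime_column(S, i, lam):
    p = S.shape[0]
    e_i = np.zeros(p); e_i[i] = 1.0
    # Split b = u - v with u, v >= 0 so the L1 objective becomes linear.
    c = np.ones(2 * p)
    A_ub = np.block([[S, -S], [-S, S]])
    b_ub = np.concatenate([e_i + lam, lam - e_i])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    u, v = res.x[:p], res.x[p:]
    return u - v

# Toy usage: estimate one column of the precision matrix from a sample covariance.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
S_hat = np.cov(X, rowvar=False)
beta_0 = clime_column(S_hat, 0, lam=0.2)
```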
R2N2: Residual Recurrent Neural Networks for Multivariate Time Series Forecasting
Goel, Hardik, Melnyk, Igor, Banerjee, Arindam
Multivariate time-series modeling and forecasting is an important problem with numerous applications. Traditional approaches such as VAR (vector auto-regressive) models and more recent approaches such as RNNs (recurrent neural networks) are indispensable tools in modeling time-series data. In many multivariate time-series modeling problems, there is usually a significant linear dependency component, for which VARs are suitable, and a nonlinear component, for which RNNs are suitable. Modeling such time series with only a VAR or only an RNN can lead to poor predictive performance or to complex models with long training times. In this work, we propose a hybrid model called R2N2 (Residual RNN), which first models the time series with a simple linear model (like VAR) and then models its residual errors using RNNs. R2N2s can be trained using existing algorithms for VARs and RNNs. Through an extensive empirical evaluation on two real-world datasets (from the aviation and climate domains), we show that R2N2 is competitive, usually better than either VAR or RNN used alone. We also show that R2N2 is faster to train than an RNN while requiring fewer hidden units.
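The decomposition is simple to state. The sketch below is an illustrative outline, not the authors' implementation: a VAR(1) fit by least squares produces a residual series that an RNN would then model; the RNN stage is left as a placeholder callable.

```python
# Illustrative two-stage R2N2-style outline (assumed setup, not the authors' code).
import numpy as np

def fit_var1(X):
    """X: (T, d) multivariate series. Returns the VAR(1) coefficient matrix A."""
    X_past, X_next = X[:-1], X[1:]
    B, *_ = np.linalg.lstsq(X_past, X_next, rcond=None)  # X_next ~ X_past @ B, so A = B.T
    return B.T

def r2n2_forecast(X, rnn_residual_model=None):
    A = fit_var1(X)
    linear_pred = X[:-1] @ A.T            # one-step-ahead VAR predictions
    residuals = X[1:] - linear_pred       # the series the RNN is trained on
    next_linear = X[-1] @ A.T             # VAR forecast for the next step
    # In R2N2, an RNN trained on `residuals` supplies the nonlinear correction.
    correction = rnn_residual_model(residuals) if rnn_residual_model else 0.0
    return next_linear + correction
```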
High Dimensional Structured Superposition Models
Gu, Qilong, Banerjee, Arindam
High dimensional superposition models characterize observations using parameters which can be written as a sum of multiple component parameters, each with its own structure, e.g., the sum of low-rank and sparse matrices, or the sum of sparse and rotated sparse vectors. In this paper, we consider general superposition models which allow the sum of any number of component parameters, where each component structure can be characterized by any norm. We present a simple estimator for such models, give a geometric condition under which the components can be accurately estimated, characterize the sample complexity of the estimator, and give high-probability non-asymptotic bounds on the component-wise estimation error. We use tools from empirical processes and generic chaining for the statistical analysis, and our results, which substantially generalize prior work on superposition models, are in terms of Gaussian widths of suitable sets.
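For concreteness, a constrained estimator of this type, written here in one representative form that may differ in details from the exact formulation analyzed in the paper, observes $y = X\big(\sum_{i=1}^{k} \theta_i^*\big) + \epsilon$ and solves

$$(\hat{\theta}_1,\ldots,\hat{\theta}_k) \in \arg\min_{\theta_1,\ldots,\theta_k} \; \frac{1}{2}\Big\| y - X\Big(\sum_{i=1}^{k} \theta_i\Big) \Big\|_2^2 \quad \text{s.t.} \quad \|\theta_i\|_{(i)} \le \beta_i, \quad i = 1,\ldots,k,$$

where $\|\cdot\|_{(i)}$ is the norm encoding the structure of the $i$-th component and the radii $\beta_i$ are assumed tuning parameters.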
Recommendation under Capacity Constraints
Christakopoulou, Konstantina, Kawale, Jaya, Banerjee, Arindam
In this paper, we investigate the common scenario where every candidate item for recommendation is characterized by a maximum capacity, i.e., the number of seats at a Point-of-Interest (POI) or the size of an item's inventory. Despite the prevalence of recommending items under capacity constraints in a variety of settings, to the best of our knowledge, none of the known recommender methods is designed to respect capacity constraints. To close this gap, we extend three state-of-the-art latent factor recommendation approaches: probabilistic matrix factorization (PMF), geographical matrix factorization (GeoMF), and Bayesian personalized ranking (BPR), to optimize for both recommendation accuracy and expected item usage that respects the capacity constraints. We introduce the useful concepts of user propensity to listen and item capacity. Our experimental results on real-world datasets, for both item recommendation and POI recommendation, highlight the benefit of our method in the setting of recommendation under capacity constraints.
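As one plausible, purely illustrative way to encode such a constraint, a soft capacity penalty can be added to a PMF-style objective; the propensity and capacity terms below are assumed inputs, and the actual formulations in the paper for PMF, GeoMF, and BPR may differ.

```python
# Illustrative sketch only: a capacity-penalized matrix factorization loss.
import numpy as np

def capacity_penalized_loss(R, U, V, propensity, capacity, lam=0.1, gamma=1.0):
    """R: (n_users, n_items) ratings with NaN for missing entries;
    U: (n_users, k), V: (n_items, k) latent factors;
    propensity: (n_users,) propensity to consume; capacity: (n_items,)."""
    pred = U @ V.T
    mask = ~np.isnan(R)
    err = np.where(mask, R - pred, 0.0)                 # fit only observed entries
    fit = np.sum(err ** 2)
    reg = lam * (np.sum(U ** 2) + np.sum(V ** 2))       # usual Frobenius regularizer
    # Expected usage per item: propensity-weighted, squashed predicted affinity.
    expected_usage = propensity @ (1.0 / (1.0 + np.exp(-pred)))
    overflow = np.maximum(0.0, expected_usage - capacity)
    return fit + reg + gamma * np.sum(overflow ** 2)    # penalize exceeding capacity
```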
Spatial Projection of Multiple Climate Variables Using Hierarchical Multitask Learning
Goncalves, Andre R. (Center for Research and Development in Telecommunication (CPqD)) | Banerjee, Arindam (University of Minnesota - Twin Cities) | Zuben, Fernando J. Von (University of Campinas)
Future projection of climate is typically obtained by combining outputs from multiple Earth System Models (ESMs) for several climate variables such as temperature and precipitation. While the IPCC has traditionally used a simple average of model outputs, recent work has illustrated potential advantages of using a multitask learning (MTL) framework for projections of individual climate variables. In this paper we introduce a framework for hierarchical multitask learning (HMTL) with two levels of tasks such that each super-task, i.e., each task at the top level, is itself a multitask learning problem over sub-tasks. For climate projections, each super-task focuses on spatial projections of a specific climate variable using an MTL formulation. For the proposed HMTL approach, a group lasso regularization is added to couple parameters across the super-tasks, which in the climate context helps exploit relationships among the behavior of different climate variables at a given spatial location. We show that some recent works on MTL based on learning task dependency structures can be viewed as special cases of HMTL. Experiments on synthetic and real climate data show that HMTL produces better results than decoupled MTL methods applied separately to the super-tasks, and that HMTL significantly outperforms baselines for climate projection.
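For reference, one standard form of such a coupling penalty, not necessarily the exact grouping used in the paper, applies a group lasso across the super-tasks' parameters, with $w_j^{(s)}$ denoting the $j$-th parameter of super-task $s$:

$$\Omega\big(W^{(1)},\ldots,W^{(S)}\big) \;=\; \mu \sum_{j} \Big\| \big(w_j^{(1)},\,\ldots,\,w_j^{(S)}\big) \Big\|_2,$$

which encourages corresponding parameters to be active or inactive together across super-tasks, i.e., across climate variables at a given spatial location.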
Structured Matrix Recovery via the Generalized Dantzig Selector
Chen, Sheng, Banerjee, Arindam
In recent years, structured matrix recovery problems have gained considerable attention due to their real-world applications, such as recommender systems and computer vision. Much of the existing work has focused on matrices with low-rank structure, and limited progress has been made on matrices with other types of structure. In this paper we present a non-asymptotic analysis for the estimation of generally structured matrices via the generalized Dantzig selector based on sub-Gaussian measurements. We show that the estimation error can always be succinctly expressed in terms of a few geometric measures, such as Gaussian widths of suitable sets associated with the structure of the underlying true matrix. Further, we derive general bounds on these geometric measures for structures characterized by unitarily invariant norms, a large family covering most matrix norms of practical interest. Examples are provided to illustrate the utility of our theoretical development.
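For reference, with structure-inducing norm $R(\cdot)$, dual norm $R^*(\cdot)$, and $n$ measurements $y_i = \langle X_i, \Theta^* \rangle + \epsilon_i$, the generalized Dantzig selector can be written (up to scaling conventions) as

$$\hat{\Theta} \in \arg\min_{\Theta} \; R(\Theta) \quad \text{s.t.} \quad R^*\Big( \frac{1}{n}\sum_{i=1}^{n} \big( \langle X_i, \Theta \rangle - y_i \big) X_i \Big) \le \lambda_n,$$

i.e., the dual norm of the gradient of the squared loss is constrained by $\lambda_n$.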
The Matrix Generalized Inverse Gaussian Distribution: Properties and Applications
Fazayeli, Farideh, Banerjee, Arindam
While the Matrix Generalized Inverse Gaussian ($\mathcal{MGIG}$) distribution arises naturally in some settings as a distribution over symmetric positive semi-definite matrices, certain key properties of the distribution and effective ways of sampling from it have not been carefully studied. In this paper, we show that the $\mathcal{MGIG}$ is unimodal, and that the mode can be obtained by solving an Algebraic Riccati Equation (ARE) [7]. Based on this property, we propose an importance sampling method for the $\mathcal{MGIG}$ in which the mode of the proposal distribution matches that of the target. The proposed sampling method is more efficient than existing approaches [32, 33], which use proposal distributions whose mode may be far from the $\mathcal{MGIG}$'s mode. Further, we show that the posterior distribution in latent factor models, such as probabilistic matrix factorization (PMF) [25], when marginalized over one latent factor, has the $\mathcal{MGIG}$ distribution. This characterization leads to a novel Collapsed Monte Carlo (CMC) inference algorithm for such latent factor models. We illustrate that CMC achieves a lower log loss or perplexity than MCMC and needs fewer samples.
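Mode-matched importance sampling itself has a simple skeleton, shown below in a generic form (not the paper's $\mathcal{MGIG}$-specific sampler); the proposal is assumed to be given with its mode already matched to the target's.

```python
# Generic self-normalized importance sampling skeleton (assumed setup).
import numpy as np

def importance_estimate(log_target, proposal_sample, proposal_logpdf, f, n=10_000):
    """Estimate E_target[f(X)] given an unnormalized target log-density and a
    proposal (sampler + log-pdf) whose mode matches the target's."""
    xs = [proposal_sample() for _ in range(n)]
    log_w = np.array([log_target(x) - proposal_logpdf(x) for x in xs])
    w = np.exp(log_w - log_w.max())       # subtract the max for numerical stability
    w /= w.sum()                          # self-normalize the weights
    return sum(wi * f(xi) for wi, xi in zip(w, xs))
```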
Alternating Estimation for Structured High-Dimensional Multi-Response Models
Chen, Sheng, Banerjee, Arindam
We consider learning high-dimensional multi-response linear models with structured parameters. By exploiting the noise correlations among responses, we propose an alternating estimation (AltEst) procedure to estimate the model parameters based on the generalized Dantzig selector. Under suitable sample size and resampling assumptions, we show that the error of the estimates generated by AltEst converges linearly, with high probability, to a certain minimum achievable level, which can be tersely expressed in terms of a few geometric measures, such as the Gaussian width of sets related to the parameter structure. To the best of our knowledge, this is the first non-asymptotic statistical guarantee for such an AltEst-type algorithm applied to estimation problems with general structures.
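Structurally, an alternating-estimation loop of this kind can be outlined as below; this is a schematic in the spirit of AltEst rather than the paper's algorithm, and the generalized Dantzig selector subproblem is abstracted as a user-supplied solver.

```python
# Schematic alternating-estimation loop (assumed setup, not the paper's algorithm).
import numpy as np

def alt_est(X, Y, solve_gds, n_iters=10):
    """X: (n, p) design; Y: (n, m) responses;
    solve_gds(X, y, Sigma) -> (p,) structured estimate for one response."""
    n, m = Y.shape
    Sigma = np.eye(m)                          # initial noise covariance across responses
    Theta = np.zeros((X.shape[1], m))
    for _ in range(n_iters):
        for j in range(m):
            # Re-estimate each response's parameters, exploiting noise correlations.
            Theta[:, j] = solve_gds(X, Y[:, j], Sigma)
        residuals = Y - X @ Theta
        Sigma = (residuals.T @ residuals) / n  # update the noise covariance estimate
    return Theta, Sigma
```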