AITopics

We present a new approach to sample from generic binary distributions, based on an exact Hamiltonian Monte Carlo algorithm applied to a piecewise continuous augmentation of the binary distribution of interest. An extension of this idea to distributions over mixtures of binary and continuous variables allows us to sample from posteriors of linear and probit regression models with spike-and-slab priors and truncated parameters. We illustrate the advantages of these algorithms in several examples in which they outperform the Metropolis or Gibbs samplers.

artificial intelligence, machine learning, sampler, (15 more...)

Country: North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.36)

Han, Shaobo, Liao, Xuejun, Carin, Lawrence

Integrated Non-Factorized Variational Inference

We present a non-factorized variational method for full posterior inference in Bayesian hierarchical models, with the goal of capturing the posterior variable dependencies via efficient and possibly parallel computation. Our approach unifies the integrated nested Laplace approximation (INLA) under the variational framework. The proposed method is applicable in more challenging scenarios than typically assumed by INLA, such as Bayesian Lasso, which is characterized by the non-differentiability of the $\ell_{1}$ norm arising from independent Laplace priors. We derive an upper bound for the Kullback-Leibler divergence, which yields a fast closed-form solution via decoupled optimization. Our method is a reliable analytic alternative to Markov chain Monte Carlo (MCMC), and it results in a tighter evidence lower bound than that of mean-field variational Bayes (VB) method.

approximation, artificial intelligence, machine learning, (17 more...)

Country: North America > United States (0.94)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Robust Sparse Principal Component Regression under the High Dimensional Elliptical Model

Han, Fang, Liu, Han

In this paper we focus on the principal component regression and its application to high dimension non-Gaussian data. The major contributions are in two folds. First, in low dimensions and under a double asymptotic framework where both the dimension $d$ and sample size $n$ can increase, by borrowing the strength from recent development in minimax optimal principal component estimation, we first time sharply characterize the potential advantage of classical principal component regression over least square estimation under the Gaussian model. Secondly, we propose and analyze a new robust sparse principal component regression on high dimensional elliptically distributed data. The elliptical distribution is a semiparametric generalization of the Gaussian, including many well known distributions such as multivariate Gaussian, rank-deficient Gaussian, $t$, Cauchy, and logistic. It allows the random vector to be heavy tailed and have tail dependence. These extra flexibilities make it very suitable for modeling finance and biomedical imaging data. Under the elliptical model, we prove that our method can estimate the regression coefficients in the optimal parametric rate and therefore is a good alternative to the Gaussian based methods. Experiments on synthetic and real world data are conducted to illustrate the empirical usefulness of the proposed method.

artificial intelligence, machine learning, principal component regression, (14 more...)

Country: North America > United States (0.46)

Industry:

Health & Medicine (0.88)
Banking & Finance > Trading (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.53)

Javanmard, Adel, Montanari, Andrea

Confidence Intervals and Hypothesis Testing for High-Dimensional Statistical Models

Fitting high-dimensional statistical models often requires the use of non-linear parameter estimation procedures. As a consequence, it is generally impossible to obtain an exact characterization of the probability distribution of the parameter estimates. This in turn implies that it is extremely challenging to quantify the `uncertainty' associated with a certain parameter estimate. Concretely, no commonly accepted procedure exists for computing classical measures of uncertainty and statistical significance as confidence intervals or p-values. We consider here a broad class of regression problems, and propose an efficient algorithm for constructing confidence intervals and p-values. The resulting confidence intervals have nearly optimal size. When testing for the null hypothesis that a certain parameter is vanishing, our method has nearly optimal power. Our approach is based on constructing a `de-biased' version of regularized M-estimators. The new construction improves over recent work in the field in that it does not assume a special structure on the design matrix. Furthermore, proofs are remarkably simple. We test our method on a diabetes prediction problem.

artificial intelligence, confidence interval, machine learning, (16 more...)

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Nguyen, Viet-An, Ying, Jordan L., Resnik, Philip

Lexical and Hierarchical Topic Regression

Inspired by a two-level theory that unifies agenda setting and ideological framing, we propose supervised hierarchical latent Dirichlet allocation (SHLDA) which jointly captures documents' multi-level topic structure and their polar response variables. Our model extends the nested Chinese restaurant process to discover a tree-structured topic hierarchy and uses both per-topic hierarchical and per-word lexical regression parameters to model the response variables. Experiments in a political domain and on sentiment analysis tasks show that SHLDA improves predictive accuracy while adding a new dimension of insight into how topics under discussion are framed.

artificial intelligence, machine learning, natural language, (18 more...)

Country: North America > United States > Maryland > Prince George's County > College Park (0.14)

Industry:

Media (1.00)
Law (1.00)
Government > Regional Government > North America Government > United States Government (0.94)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Lasso Screening Rules via Dual Polytope Projection

Wang, Jie, Zhou, Jiayu, Wonka, Peter, Ye, Jieping

Lasso is a widely used regression technique to find sparse representations. When the dimension of the feature space and the number of samples are extremely large, solving the Lasso problem remains challenging. To improve the efficiency of solving large-scale Lasso problems, El Ghaoui and his colleagues have proposed the SAFE rules which are able to quickly identify the inactive predictors, i.e., predictors that have $0$ components in the solution vector. Then, the inactive predictors or features can be removed from the optimization problem to reduce its scale. By transforming the standard Lasso to its dual form, it can be shown that the inactive predictors include the set of inactive constraints on the optimal dual solution. In this paper, we propose an efficient and effective screening rule via Dual Polytope Projections (DPP), which is mainly based on the uniqueness and nonexpansiveness of the optimal dual solution due to the fact that the feasible set in the dual space is a convex and closed polytope. Moreover, we show that our screening rule can be extended to identify inactive groups in group Lasso. To the best of our knowledge, there is currently no exact" screening rule for group Lasso. We have evaluated our screening rule using many real data sets. Results show that our rule is more effective to identify inactive predictors than existing state-of-the-art screening rules for Lasso."

artificial intelligence, machine learning, predictor, (16 more...)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Structured Learning via Logistic Regression

Domke, Justin

A successful approach to structured learning is to write the learning objective as a joint function of linear parameters and inference messages, and iterate between updates to each. This paper observes that if the inference problem is "smoothed" through the addition of entropy terms, for fixed messages, the learning objective reduces to a traditional (non-structured) logistic regression problem with respect to parameters. In these logistic regression problems, each training example has a bias term determined by the current set of messages. Based on this insight, the structured energy function can be extended from linear factors to any function class where an "oracle" exists to minimize a logistic loss.

artificial intelligence, machine learning, segmentation, (18 more...)

Genre:

Research Report > New Finding (0.92)
Research Report > Experimental Study (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.82)

Yang, Eunho, Ravikumar, Pradeep K.

Dirty Statistical Models

We provide a unified framework for the high-dimensional analysis of “superposition-structured” or “dirty” statistical models: where the model parameters are a “superposition” of structurally constrained parameters. We allow for any number and types of structures, and any statistical model. We consider the general class of $M$-estimators that minimize the sum of any loss function, and an instance of what we call a “hybrid” regularization, that is the infimal convolution of weighted regularization functions, one for each structural component. We provide corollaries showcasing our unified framework for varied statistical models such as linear regression, multiple regression and principal component analysis, over varied superposition structures.

artificial intelligence, machine learning, matrix, (16 more...)

Country: North America > United States > Texas (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.38)

Wager, Stefan, Wang, Sida, Liang, Percy S.

Dropout Training as Adaptive Regularization

Dropout and other feature noising schemes control overfitting by artificially corrupting the training data. For generalized linear models, dropout performs a form of adaptive regularization. Using this viewpoint, we show that the dropout regularizer is first-order equivalent to an $\LII$ regularizer applied after scaling the features by an estimate of the inverse diagonal Fisher information matrix. We also establish a connection to AdaGrad, an online learner, and find that a close relative of AdaGrad operates by repeatedly solving linear dropout-regularized problems. By casting dropout as regularization, we develop a natural semi-supervised algorithm that uses unlabeled data to create a better adaptive regularizer. We apply this idea to document classification tasks, and show that it consistently boosts the performance of dropout training, improving on state-of-the-art results on the IMDB reviews dataset.

artificial intelligence, machine learning, regression, (15 more...)

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.32)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.51)