AITopics | Regression

Regression

Efficient sampling for Gaussian linear regression with arbitrary priors

Hahn, P. Richard, He, Jingyu, Lopes, Hedibert

arXiv.org Machine Learning, Jun-14-2018

This paper develops a computationally efficient posterior sampling algorithm for Bayesian linear regression models with Gaussian errors. Our new approach is motivated by the fact that existing software implementations for Bayesian linear regression do not readily handle problems with large numbers of observations (hundreds of thousands) and predictors (thousands). Moreover, existing sampling algorithms for popular shrinkage priors are bespoke Gibbs samplers based on case-specific latent variable representations. By contrast, the new algorithm does not rely on case-specific auxiliary variable representations, which allows for rapid prototyping of novel shrinkage priors outside the conditionally Gaussian framework. Specifically, we propose a slice-within-Gibbs sampler based on the elliptical slice sampler of Murray et al. [2010].
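For readers unfamiliar with the elliptical slice sampler that the abstract builds on, the following is a minimal sketch of the generic update of Murray et al. [2010] for a target proportional to a zero-mean Gaussian prior times an arbitrary likelihood. It is not the paper's slice-within-Gibbs algorithm. The function names, the toy data, and the N(0, 1) prior are all assumptions made for illustration.

```python
import math
import random


def elliptical_slice_step(f, log_lik, draw_prior, rng):
    """One elliptical slice sampling update (generic form, zero-mean Gaussian prior).

    f          : current state, a list of floats
    log_lik    : function mapping a state to its log-likelihood
    draw_prior : function returning a fresh draw from the Gaussian prior
    rng        : random.Random instance
    """
    nu = draw_prior()  # auxiliary prior draw; f and nu define the ellipse
    # Slice height: log L(f) + log(u), with u in (0, 1].
    log_y = log_lik(f) + math.log(1.0 - rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    lo, hi = theta - 2.0 * math.pi, theta
    while True:
        c, s = math.cos(theta), math.sin(theta)
        proposal = [fi * c + ni * s for fi, ni in zip(f, nu)]
        if log_lik(proposal) > log_y:
            return proposal
        # Rejected: shrink the angle bracket toward theta = 0 (the current state).
        if theta < 0.0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)


# Toy problem (made up for illustration): y_i = beta * x_i + N(0, 1) noise,
# with a N(0, 1) prior on the single coefficient beta.
X = [1.0, 2.0, 3.0, 4.0]
Y = [1.1, 1.9, 3.2, 3.9]


def toy_log_lik(state):
    beta = state[0]
    return -0.5 * sum((y - beta * x) ** 2 for x, y in zip(X, Y))


def run_chain(n_steps, seed=0):
    """Run the sampler on the toy problem and return the draws of beta."""
    rng = random.Random(seed)

    def draw_prior():
        return [rng.gauss(0.0, 1.0)]

    state = [0.0]
    draws = []
    for _ in range(n_steps):
        state = elliptical_slice_step(state, toy_log_lik, draw_prior, rng)
        draws.append(state[0])
    return draws


if __name__ == "__main__":
    draws = run_chain(5000)
    # For this conjugate toy model the exact posterior mean is 30.1 / 31.
    print(sum(draws) / len(draws))
```

The update has no step size to tune and always terminates with an accepted point, because the bracket shrinks toward the current state, which lies above the slice height.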

artificial intelligence, machine learning, sampler, (16 more...)

arXiv.org Machine Learning

1806.05738

Country:

Europe > Austria > Vienna (0.14)
South America > Brazil (0.04)
Oceania > New Zealand (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Data-Driven Decentralized Optimal Power Flow

Dobbe, Roel, Sondermeijer, Oscar, Fridovich-Keil, David, Arnold, Daniel, Callaway, Duncan, Tomlin, Claire

arXiv.org Artificial Intelligence, Jun-14-2018

The implementation of optimal power flow (OPF) methods to perform voltage and power flow regulation in electric networks is generally believed to require communication. We consider distribution systems with multiple controllable Distributed Energy Resources (DERs) and present a data-driven approach to learn control policies for each DER to reconstruct and mimic the solution to a centralized OPF problem from solely locally available information. Collectively, all local controllers closely match the centralized OPF solution, providing near-optimal performance and satisfaction of system constraints. A rate distortion framework facilitates the analysis of how well the resulting fully decentralized control policies are able to reconstruct the OPF solution. Our methodology provides a natural extension to decide what buses a DER should communicate with to improve the reconstruction of its individual policy. The method is applied on both single- and three-phase test feeder networks using data from real loads and distributed generators. It provides a framework for Distribution System Operators to efficiently plan and operate the contributions of DERs to active distribution networks.

artificial intelligence, ieee transaction, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1806.0679

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Hawaii (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry:

Energy > Power Industry (1.00)
Energy > Renewable > Solar (0.68)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Logistic Regression (Predictive Modeling) workshop using R

#artificialintelligence, Jun-12-2018, 13:47:33 GMT

This course is a workshop on logistic regression using R.

logistic regression, machine learning, predictive modeling, (2 more...)

#artificialintelligence

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.82)
Education > Educational Setting > Online (0.82)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.82)

Logistic Ensemble Models

Vanderheyden, Bob, Priestley, Jennifer

arXiv.org Machine Learning, Jun-12-2018

Predictive models that are developed in a regulated industry or a regulated application, like determination of credit worthiness, must be interpretable and rational (e.g., meaningful improvements in basic credit behavior must result in improved credit worthiness scores). Machine Learning technologies provide very good performance with minimal analyst intervention, making them well suited to a high volume analytic environment, but the majority are black box tools that provide very limited insight or interpretability into key drivers of model performance or predicted model output values. This paper presents a methodology that blends one of the most popular predictive statistical modeling methods for binary classification with a core model enhancement strategy found in machine learning. The resulting prediction methodology provides solid performance, from minimal analyst effort, while providing the interpretability and rationality required in regulated industries, as well as in other environments where interpretation of model parameters is required (e.g. businesses that require interpretation of models, to take action on them).

artificial intelligence, machine learning, predictor, (15 more...)

arXiv.org Machine Learning

1806.04555

Country:

North America > United States > New York (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)

Genre:

Research Report > Experimental Study (0.97)
Research Report > New Finding (0.72)

Industry:

Banking & Finance > Credit (1.00)
Law (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

A Fast and Easy Regression Technique for k-NN Classification Without Using Negative Pairs

Shigeto, Yutaro, Shimbo, Masashi, Matsumoto, Yuji

arXiv.org Machine Learning, Jun-11-2018

This paper proposes an inexpensive way to learn an effective dissimilarity function to be used for $k$-nearest neighbor ($k$-NN) classification. Unlike Mahalanobis metric learning methods that map both query (unlabeled) objects and labeled objects to new coordinates by a single transformation, our method learns a transformation of labeled objects to new points in the feature space whereas query objects are kept in their original coordinates. This method has several advantages over existing distance metric learning methods: (i) In experiments with large document and image datasets, it achieves $k$-NN classification accuracy better than or at least comparable to the state-of-the-art metric learning methods. (ii) The transformation can be learned efficiently by solving a standard ridge regression problem. For document and image datasets, training is often more than two orders of magnitude faster than the fastest metric learning methods tested. This speed-up is also due to the fact that the proposed method eliminates the optimization over "negative" object pairs, i.e., objects whose class labels are different. (iii) The formulation has a theoretical justification in terms of reducing hubness in data.
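As background for point (ii), a standard ridge regression problem has a closed-form solution, which is one reason training can be fast. The generic form is shown below. The symbols ($X$ for an input matrix, $Y$ for a target matrix, $W$ for the learned linear map, $\lambda > 0$ for the regularization weight) are assumptions for illustration and are not taken from the paper, whose specific choice of inputs and targets is not given in this abstract.

```latex
\min_{W} \; \lVert X W - Y \rVert_F^2 + \lambda \lVert W \rVert_F^2
\qquad\Longrightarrow\qquad
W^{\star} = \left( X^{\top} X + \lambda I \right)^{-1} X^{\top} Y
```

Because $X^{\top} X + \lambda I$ is positive definite for any $\lambda > 0$, the solution is unique and requires only one linear solve.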

artificial intelligence, dataset, machine learning, (14 more...)

arXiv.org Machine Learning

1806.03945

Country:

Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Kantō > Chiba Prefecture > Chiba (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.40)

Prediction Intervals for Machine Learning

#artificialintelligence, Jun-9-2018, 15:21:59 GMT

A prediction interval is calculated as some combination of the estimated variance of the model and the variance of the outcome variable. Prediction intervals are easy to describe, but difficult to calculate in practice. In simple cases like linear regression, we can estimate the prediction interval directly. In the cases of nonlinear regression algorithms, such as artificial neural networks, it is a lot more challenging and requires the choice and implementation of specialized techniques. General techniques such as the bootstrap resampling method can be used, but are computationally expensive to calculate. The paper "A Comprehensive Review of Neural Network-based Prediction Intervals and New Advances" provides a reasonably recent study of prediction intervals for nonlinear models in the context of neural networks.
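To make the linear regression case concrete, here is a minimal sketch of a prediction interval for simple (one-predictor) least squares. It combines the residual variance with the variance of the fitted line, as the first sentence above describes. The function name and the example data are hypothetical. Note one assumption: the sketch uses a normal quantile, which approximates the Student-t quantile with n - 2 degrees of freedom, so for small samples the interval comes out somewhat too narrow.

```python
import math
from statistics import NormalDist, mean


def prediction_interval(xs, ys, x_new, level=0.95):
    """Approximate prediction interval for simple linear regression.

    Returns (lower, y_hat, upper) for a new observation at x_new.
    Uses a normal quantile in place of the exact Student-t quantile.
    """
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)

    # Ordinary least squares fit.
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar

    # Residual standard error with n - 2 degrees of freedom.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s = math.sqrt(sum(r * r for r in residuals) / (n - 2))

    # The leading 1.0 is the noise of the new outcome; the remaining terms
    # are the uncertainty of the estimated regression line at x_new.
    spread = math.sqrt(1.0 + 1.0 / n + (x_new - x_bar) ** 2 / sxx)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)

    y_hat = intercept + slope * x_new
    half_width = z * s * spread
    return y_hat - half_width, y_hat, y_hat + half_width


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    print(prediction_interval(xs, ys, 3.0))
    print(prediction_interval(xs, ys, 10.0))  # wider: far from the training data
```

The interval widens as `x_new` moves away from the mean of the training inputs, since the fitted line is less certain there.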

artificial intelligence, machine learning, prediction interval, (12 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)

Introducing Aardpfark: Exporting Spark ML Models to PFA

#artificialintelligence, Jun-9-2018, 13:36:32 GMT

The common perception of machine learning is that it starts with data and ends with a model. In real-world production systems, the traditional data science and machine learning workflow of data preparation, feature engineering, and model selection, while important, is just one aspect. A critical missing piece is the deployment and management of models, as well as the integration between the model creation and deployment phases. This is particularly challenging in the case of deploying Apache Spark ML pipelines for low latency scoring. While MLlib's DataFrame API is powerful, elegant, and works well in batch scoring scenarios, it is relatively ill-suited to the needs of many real-time predictive applications, for two main reasons.

aardpfark, artificial intelligence, machine learning, (12 more...)

#artificialintelligence

Industry: Information Technology (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.32)

A hybrid econometric-machine learning approach for relative importance analysis: Food inflation

Malhotra, Akash

arXiv.org Machine Learning, Jun-9-2018

A measure of relative importance of variables is often desired by researchers when the explanatory aspects of econometric methods are of interest. To this end, the author briefly reviews the limitations of conventional econometrics in constructing a reliable measure of variable importance. The author highlights the relative stature of explanatory and predictive analysis in economics and the emergence of fruitful collaborations between econometrics and computer science. Learning lessons from both, the author proposes a hybrid approach based on conventional econometrics and advanced machine learning (ML) algorithms, which are otherwise used in predictive analytics. The purpose of this article is two-fold: first, to propose a hybrid approach to assess relative importance and demonstrate its applicability in addressing policy priority issues with an example of food inflation in India; and second, more broadly, to introduce the possibility of conflation of ML and conventional econometrics to an audience of researchers in economics and the social sciences in general.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

1806.04517

Country:

Asia > India > NCT > New Delhi (0.04)
North America > United States > New York (0.04)
Asia > India > Maharashtra > Mumbai (0.04)
(8 more...)

Genre: Research Report > Experimental Study (0.47)

Industry:

Food & Agriculture > Agriculture (1.00)
Banking & Finance > Economy (1.00)
Government > Regional Government > Asia Government > India Government (0.69)
Energy (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Curriculum Learning by Transfer Learning: Theory and Experiments with Deep Networks

Weinshall, Daphna, Cohen, Gad, Amir, Dan

arXiv.org Artificial Intelligence, Jun-8-2018

We provide a theoretical investigation of curriculum learning in the context of stochastic gradient descent when optimizing the convex linear regression loss. We prove that the rate of convergence of an ideal curriculum learning method is monotonically increasing with the difficulty of the examples. Moreover, among all equally difficult points, convergence is faster when using points which incur higher loss with respect to the current hypothesis. We then analyze curriculum learning in the context of training a CNN. We describe a method which infers the curriculum by way of transfer learning from another network, pre-trained on a different task. While this approach can only approximate the ideal curriculum, we observe empirically similar behavior to the one predicted by the theory, namely, a significant boost in convergence speed at the beginning of training. When the task is made more difficult, improvement in generalization performance is also observed. Finally, curriculum learning exhibits robustness against unfavorable conditions such as excessive regularization.

curriculum, curriculum learning, learning, (15 more...)

arXiv.org Artificial Intelligence

1802.03796

Country:

Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
North America > United States > New Jersey (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Lightweight Stochastic Optimization for Minimizing Finite Sums with Infinite Data

Zheng, Shuai, Kwok, James T.

arXiv.org Machine Learning, Jun-7-2018

Variance reduction has been commonly used in stochastic optimization. It relies crucially on the assumption that the data set is finite. However, when the data are imputed with random noise as in data augmentation, the perturbed data set becomes essentially infinite. Recently, the stochastic MISO (S-MISO) algorithm was introduced to address this expected risk minimization problem. Though it converges faster than SGD, a significant amount of memory is required. In this paper, we propose two SGD-like algorithms for expected risk minimization with random perturbation, namely, stochastic sample average gradient (SSAG) and stochastic SAGA (S-SAGA). The memory cost of SSAG does not depend on the sample size, while that of S-SAGA is the same as those of variance reduction methods on unperturbed data. Theoretical analysis and experimental results on logistic regression and AUC maximization show that SSAG has a faster convergence rate than SGD with comparable space requirement, while S-SAGA outperforms S-MISO in terms of both iteration complexity and storage.

artificial intelligence, lightweight stochastic optimization, machine learning, (14 more...)

arXiv.org Machine Learning

1806.02927

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
