AITopics | Regression

In this article we consider the Conditional Super Learner (CSL), an algorithm which selects the best model candidate from a library conditional on the covariates. The CSL expands the idea of using cross-validation to select the best model and merges it with meta learning. Here we propose a specific algorithm that finds a local minimum to the problem posed, proof that it converges at a rate faster than Op(n^-1/4) and offers extensive empirical evidence that it is an excellent candidate to substitute stacking or for the analysis of Hierarchical problems.

algorithm, cross validation, validation, (14 more...)

arXiv.org Machine Learning

1912.06675

Country:

North America > United States > California > San Francisco County > San Francisco (0.28)
North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.60)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Understanding complex predictive models with Ghost Variables

Delicado, Pedro, Peña, Daniel

arXiv.org Machine LearningDec-13-2019

We propose a procedure for assigning a relevance measure to each explanatory variable in a complex predictive model. We assume that we have a training set to fit the model and a test set to check the out of sample performance. First, the individual relevance of each variable is computed by comparing the predictions in the test set, given by the model that includes all the variables with those of another model in which the variable of interest is substituted by its ghost variable, defined as the prediction of this variable by using the rest of explanatory variables. Second, we check the joint effects among the variables by using the eigenvalues of a relevance matrix that is the covariance matrix of the vectors of individual effects. It is shown that in simple models, as linear or additive models, the proposed measures are related to standard measures of significance of the variables and in neural networks models (and in other algorithmic prediction models) the procedure provides information about the joint and individual effects of the variables that is not usually available by other methods. The procedure is illustrated with simulated examples and the analysis of a large real data set.

ghost variable, matrix, relevance, (14 more...)

arXiv.org Machine Learning

1912.06407

Country:

Europe > Spain > Galicia > Madrid (0.04)
North America > United States > New York (0.04)

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.67)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Add feedback

MM Algorithms for Distance Covariance based Sufficient Dimension Reduction and Sufficient Variable Selection

Wu, Runxiong, Chen, Xin

arXiv.org Machine LearningDec-13-2019

Sufficient dimension reduction (SDR) using distance covariance (DCOV) was recently proposed as an approach to dimension-reduction problems. Compared with other SDR methods, it is model-free without estimating link function and does not require any particular distributions on predictors (see Sheng and Yin, 2013, 2016). However, the DCOV-based SDR method involves optimizing a nonsmooth and nonconvex objective function over the Stiefel manifold. To tackle the numerical challenge, we novelly formulate the original objective function equivalently into a DC (Difference of Convex functions) program and construct an iterative algorithm based on the majorization-minimization (MM) principle. At each step of the MM algorithm, we inexactly solve the quadratic subproblem on the Stiefel manifold by taking one iteration of Riemannian Newton's method. The algorithm can also be readily extended to sufficient variable selection (SVS) using distance covariance. We establish the convergence property of the proposed algorithm under some regularity conditions. Simulation studies show our algorithm drastically improves the computation efficiency and is robust across various settings compared with the existing method. Supplemental materials for this article are available.

algorithm, equation, log null 1, (14 more...)

arXiv.org Machine Learning

1912.06342

Country: Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Machine Learning Full Course Learn Machine Learning Machine Learning Tutorial Simplilearn

#artificialintelligenceDec-12-2019, 21:47:50 GMT

What is Logistic Regression - 56:04 16. What is Linear Regression - 59:35 17.

learning, machine learning, regression, (9 more...)

#artificialintelligence

Industry: Information Technology (0.32)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Add feedback

Machine Learning Full Course - Learn Machine Learning 10 Hours Machine Learning Tutorial Edureka

#artificialintelligenceDec-12-2019, 21:47:45 GMT

This Machine Learning Tutorial is ideal for both beginners as well as professionals who want to master Machine Learning Algorithms. Below are the topics covered in this Machine Learning Tutorial for Beginners video: 2:47 What is Machine Learning? Please share it in the comment section below and our experts will answer it for you. For more information, please write back to us at sales@edureka.in or call us at IND: 9606058406 / US: 18338555775 (toll-free).

Add feedback

Linear Regressions and Split Datasets Using Sklearn - WebSystemer.no

#artificialintelligenceDec-12-2019, 02:22:09 GMT

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.40)

Add feedback

Estimation and HAC-based Inference for Machine Learning Time Series Regressions

Babii, Andrii, Ghysels, Eric, Striaukas, Jonas

arXiv.org Machine LearningDec-12-2019

Time series regression analysis in econometrics typically involves a framework relying on a set of mixing conditions to establish consistency and asymptotic normality of parameter estimates and HAC-type estimators of the residual long-run variances to conduct proper inference. This article introduces structured machine learning regressions for high-dimensional time series data using the aforementioned commonly used setting. To recognize the time series data structures we rely on the sparse-group LASSO estimator. We derive a new Fuk-Nagaev inequality for a class of $\tau$-dependent processes with heavier than Gaussian tails, nesting $\alpha$-mixing processes as a special case, and establish estimation, prediction, and inferential properties, including convergence rates of the HAC estimator for the long-run variance based on LASSO residuals. An empirical application to nowcasting US GDP growth indicates that the estimator performs favorably compared to other alternatives and that the text data can be a useful addition to more traditional numerical data.

covariate, estimator, regression, (15 more...)

arXiv.org Machine Learning

1912.06307

Country:

North America > United States > North Carolina > Orange County > Chapel Hill (0.14)
North America > United States > New York (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.87)

Industry:

Banking & Finance > Economy (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

Add feedback

Diagnosing model misspecification and performing generalized Bayes' updates via probabilistic classifiers

Thomas, Owen, Corander, Jukka

arXiv.org Machine LearningDec-12-2019

Model misspecification is a long-standing enigma of the Bayesian inference framework as posteriors tend to get overly concentrated on ill-informed parameter values towards the large sample limit. Tempering of the likelihood has been established as a safer way to do updates from prior to posterior in the presence of model misspecification. At one extreme tempering can ignore the data altogether and at the other extreme it provides the standard Bayes' update when no misspecification is assumed to be present. However, it is an open issue how to best recognize misspecification and choose a suitable level of tempering without access to the true generating model. Here we show how probabilistic classifiers can be employed to resolve this issue. By training a probabilistic classifier to discriminate between simulated and observed data provides an estimate of the ratio between the model likelihood and the likelihood of the data under the unobserved true generative process, within the discriminatory abilities of the classifier. The expectation of the logarithm of a ratio with respect to the data generating process gives an estimation of the negative Kullback-Leibler divergence between the statistical generative model and the true generative distribution. Using a set of canonical examples we show that this divergence provides a useful misspecification diagnostic, a model comparison tool, and a method to inform a generalised Bayesian update in the presence of misspecification for likelihood-based models.

classifier, log ratio, misspecification, (14 more...)

arXiv.org Machine Learning

1912.0581

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Norway > Eastern Norway > Oslo (0.04)

Genre: Research Report > Experimental Study (0.52)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

A Beginner's Guide to Machine Learning: What Aspiring Data Scientists Should Know - DZone AI

#artificialintelligenceDec-11-2019, 21:08:52 GMT

Before choosing a machine learning algorithm, it's important to know their characteristics to generate desired outputs and build smart systems. Data science is growing super fast. As the demand for AI-enabled solutions is increasing, delivering smarter systems for industries has become essential. And the correctness and efficiency through machine learning operations must be fulfilled to ensure the developed solutions complete all demands. Hence, applying machine learning algorithms on the given dataset to produce righteous results and train the intelligent system is one of the most essential steps from the entire process.

algorithm, learning, regression, (15 more...)

#artificialintelligence

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.40)
Health & Medicine > Consumer Health (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.32)

Add feedback

Large-scale Kernel Methods and Applications to Lifelong Robot Learning

Camoriano, Raffaello

arXiv.org Machine LearningDec-11-2019

As the size and richness of available datasets grow larger, the opportunities for solving increasingly challenging problems with algorithms learning directly from data grow at the same pace. Consequently, the capability of learning algorithms to work with large amounts of data has become a crucial scientific and technological challenge for their practical applicability. Hence, it is no surprise that large-scale learning is currently drawing plenty of research effort in the machine learning research community. In this thesis, we focus on kernel methods, a theoretically sound and effective class of learning algorithms yielding nonparametric estimators. Kernel methods, in their classical formulations, are accurate and efficient on datasets of limited size, but do not scale up in a cost-effective manner. Recent research has shown that approximate learning algorithms, for instance random subsampling methods like Nystr\"om and random features, with time-memory-accuracy trade-off mechanisms are more scalable alternatives. In this thesis, we provide analyses of the generalization properties and computational requirements of several types of such approximation schemes. In particular, we expose the tight relationship between statistics and computations, with the goal of tailoring the accuracy of the learning process to the available computational resources. Our results are supported by experimental evidence on large-scale datasets and numerical simulations. We also study how large-scale learning can be applied to enable accurate, efficient, and reactive lifelong learning for robotics. In particular, we propose algorithms allowing robots to learn continuously from experience and adapt to changes in their operational environment. The proposed methods are validated on the iCub humanoid robot in addition to other benchmarks.

classification, regularization parameter, steinwart and christmann, (16 more...)

arXiv.org Machine Learning

1912.05629

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada > Ontario > Toronto (0.04)
North America > United States > Massachusetts (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education > Educational Setting (0.87)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.92)
(4 more...)

Add feedback

Filters

Collaborating Authors

Regression

Conditional Super Learner

Understanding complex predictive models with Ghost Variables

MM Algorithms for Distance Covariance based Sufficient Dimension Reduction and Sufficient Variable Selection

Machine Learning Full Course Learn Machine Learning Machine Learning Tutorial Simplilearn

Machine Learning Full Course - Learn Machine Learning 10 Hours Machine Learning Tutorial Edureka

Linear Regressions and Split Datasets Using Sklearn - WebSystemer.no

Estimation and HAC-based Inference for Machine Learning Time Series Regressions

Diagnosing model misspecification and performing generalized Bayes' updates via probabilistic classifiers

A Beginner's Guide to Machine Learning: What Aspiring Data Scientists Should Know - DZone AI

Large-scale Kernel Methods and Applications to Lifelong Robot Learning