AITopics | Regression


Regression


Machine Learning with C++ - Polynomial Regression (CPU)

@machinelearnbot Apr-27-2018, 19:40:11 GMT

There are many articles about using Python to solve machine learning problems. With this article I start a series on how to use modern C++ to solve the same problems, and on which libraries can be used. I assume that readers are already familiar with machine learning concepts, so I concentrate on programming issues only. The first part is about creating a polynomial regression model with the XTensor library. XTensor is a C++ library for numerical analysis with multi-dimensional array expressions, and its containers are inspired by NumPy. Many functions in the library also have semantics similar to NumPy's, so it should be easier to start with than Eigen or ViennaCL if you are already familiar with NumPy.
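Because the excerpt stresses that XTensor mirrors NumPy, the underlying computation can be sketched in NumPy itself. This is a minimal illustration that assumes an ordinary least-squares fit; it is not the article's C++ code, and the names `fit_polynomial` and `predict` are hypothetical.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns coefficients, lowest degree first."""
    # Design matrix with columns 1, x, x^2, ..., x^degree.
    X = np.vander(x, degree + 1, increasing=True)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def predict(coeffs, x):
    """Evaluate the fitted polynomial at the points x."""
    X = np.vander(x, len(coeffs), increasing=True)
    return X @ coeffs

# Noise-free synthetic data (an assumption made for illustration only).
x = np.linspace(-1.0, 1.0, 50)
y = 1.0 + 2.0 * x - 3.0 * x**2
coeffs = fit_polynomial(x, y, degree=2)
```

On this noise-free input the fit recovers the generating coefficients up to floating-point error; with real data the degree would normally be chosen by validation.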

artificial intelligence, library, machine learning, (11 more...)

@machinelearnbot

Country: Europe > Austria > Vienna (0.27)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.39)


A Demo of Hierarchical, Moderated, Multiple Regression Analysis in R

#artificialintelligence Apr-27-2018, 14:31:36 GMT

Moderator models are often used to examine when an independent variable influences a dependent variable. More specifically, moderators are used to identify factors that change the relationship between independent (X) and dependent (Y) variables. In this article, I explain how moderation in regression works, and then demonstrate how to do a hierarchical, moderated, multiple regression analysis in R. Hierarchical, moderated, multiple regression analysis in R can get pretty complicated, so let's start at the very beginning. Y is the dependent variable, whereas X is the independent variable; i.e., the regression model tries to explain the causality between the two variables. The above equation has a single independent variable.
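The idea of moderation can be sketched as a two-step (hierarchical) fit: first a model with main effects only, then one that adds the product term X*Z, comparing the explained variance. The sketch below uses Python with NumPy rather than R; the moderator name `z`, the synthetic coefficients, and the helper `r_squared` are assumptions for illustration, not taken from the article.

```python
import numpy as np

def r_squared(design, y):
    """Fit ordinary least squares and return (coefficients, R^2)."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta, 1.0 - ss_res / ss_tot

# Synthetic data in which z moderates the effect of x on y (assumed values).
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
z = rng.normal(size=n)
y = 1.0 + 0.5 * x + 0.3 * z + 0.8 * x * z + rng.normal(scale=0.1, size=n)

ones = np.ones(n)
step1 = np.column_stack([ones, x, z])          # main effects only
step2 = np.column_stack([ones, x, z, x * z])   # adds the interaction term
beta1, r2_step1 = r_squared(step1, y)
beta2, r2_step2 = r_squared(step2, y)
delta_r2 = r2_step2 - r2_step1                 # variance explained by moderation
```

A clearly positive `delta_r2`, together with a non-negligible coefficient on the product term (`beta2[3]`), is what suggests that z changes the relationship between x and y. A formal analysis would also test whether the change is statistically significant.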

artificial intelligence, machine learning, threat, (14 more...)

#artificialintelligence

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.95)

Industry: Health & Medicine (0.52)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)


Auto-Detection of Safety Issues in Baby Products

Bleaney, Graham, Kuzyk, Matthew, Man, Julian, Mayanloo, Hossein, Tizhoosh, H. R.

arXiv.org Machine Learning Apr-27-2018

Every year, thousands of people receive consumer product related injuries. Research indicates that online customer reviews can be processed to autonomously identify product safety issues. Early identification of safety issues can lead to earlier recalls, and thus fewer injuries and deaths. A dataset of product reviews from Amazon.com was compiled, along with SaferProducts.gov complaints and recall descriptions from the Consumer Product Safety Commission (CPSC) and European Commission Rapid Alert system. A system was built to clean the collected text and to extract relevant features. Dimensionality reduction was performed by computing feature relevance through a Random Forest and discarding features with low information gain. Various classifiers were analyzed, including Logistic Regression, SVMs, Naïve Bayes, Random Forests, and an Ensemble classifier. Experimentation with various features and classifier combinations resulted in a logistic regression model with 70.2% precision in the top 50 reviews surfaced. This classifier outperforms all benchmarks set by related literature and consumer product safety professionals.
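The headline figure, precision in the top 50 reviews surfaced, reads like the standard precision-at-k metric. Assuming that interpretation, a minimal sketch is given below; the function name `precision_at_k` and the toy scores are hypothetical and do not come from the paper.

```python
def precision_at_k(scores, labels, k):
    """Fraction of true positives among the k highest-scored items.

    scores: classifier scores, higher meaning more likely a safety issue.
    labels: ground-truth labels, 1 for a real safety issue and 0 otherwise.
    """
    if k <= 0 or not scores:
        raise ValueError("k must be positive and scores must be non-empty")
    # Rank items by descending score and keep the first k.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)

# Toy example: four reviews, inspect the two highest-scored ones.
example = precision_at_k([0.9, 0.8, 0.1, 0.7], [1, 0, 1, 1], k=2)
```

In the paper's setting k would be 50, with scores coming from the logistic regression model.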

artificial intelligence, machine learning, safety issue, (17 more...)

arXiv.org Machine Learning

1805.09772

Country:

North America > United States (0.34)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Government > Regional Government (0.55)
Consumer Products & Services > Personal Products > Beauty Care Products (0.42)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)


The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression

Candes, Emmanuel J., Sur, Pragya

arXiv.org Machine Learning Apr-25-2018

This paper rigorously establishes that the existence of the maximum likelihood estimate (MLE) in high-dimensional logistic regression models with Gaussian covariates undergoes a sharp "phase transition". We introduce an explicit boundary curve $h_{\text{MLE}}$, parameterized by two scalars measuring the overall magnitude of the unknown sequence of regression coefficients, with the following property: in the limit of large sample sizes $n$ and number of features $p$ proportioned in such a way that $p/n \rightarrow \kappa$, we show that if the problem is sufficiently high dimensional in the sense that $\kappa > h_{\text{MLE}}$, then the MLE does not exist with probability one. Conversely, if $\kappa < h_{\text{MLE}}$, the MLE asymptotically exists with probability one.

artificial intelligence, machine learning, mle, (16 more...)

arXiv.org Machine Learning

1804.09753

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (0.49)
Research Report > Experimental Study (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)


Is it possible to retrieve soil-moisture content from measured VNIR hyperspectral data?

Keller, Sina, Riese, Felix M., Stötzer, Johanna, Maier, Philipp M., Hinz, Stefan

arXiv.org Machine Learning Apr-24-2018

In this paper, we investigate the potential of estimating the soil-moisture content based on VNIR hyperspectral data combined with IR data. Measurements from a multi-sensor field campaign represent the benchmark dataset which contains measured hyperspectral, IR, and soil-moisture data. We introduce a regression framework with three steps consisting of feature selection, preprocessing, and well-chosen regression models. The latter are mainly supervised machine learning models. An exception are the self-organizing maps which are a combination of unsupervised and supervised learning. We analyze the impact of the distinct preprocessing methods on the regression results. Of all regression models, the extremely randomized trees model without preprocessing provides the best estimation performance. Our results reveal the potential of the respective regression framework combined with the VNIR hyperspectral data to estimate soil moisture. In conclusion, the results of this paper provide a basis for further improvements in different research directions.

artificial intelligence, machine learning, soil moisture, (18 more...)

arXiv.org Machine Learning

1804.09046

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.89)


Between hard and soft thresholding: optimal iterative thresholding algorithms

Liu, Haoyang, Barber, Rina Foygel

arXiv.org Machine Learning Apr-24-2018

Iterative thresholding algorithms seek to optimize a differentiable objective function over a sparsity or rank constraint by alternating between gradient steps that reduce the objective, and thresholding steps that enforce the constraint. This work examines the choice of the thresholding operator, and asks whether it is possible to achieve stronger guarantees than what is possible with hard thresholding. We develop the notion of relative concavity of a thresholding operator, a quantity that characterizes the convergence performance of any thresholding operator on the target optimization problem. Surprisingly, we find that commonly used thresholding operators, such as hard thresholding and soft thresholding, are suboptimal in terms of convergence guarantees. Instead, a general class of thresholding operators, lying between hard thresholding and soft thresholding, is shown to be optimal with the strongest possible convergence guarantee among all thresholding operators. Examples of this general class include $\ell_q$ thresholding with appropriate choices of $q$, and a newly defined reciprocal thresholding operator. As a byproduct of the improved convergence guarantee, these new thresholding operators improve on the best known upper bound for prediction error of both iterative hard thresholding and Lasso in terms of the dependence on condition number in the setting of sparse linear regression.
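The alternation described in the abstract, a gradient step followed by a thresholding step, can be sketched for sparse linear regression as follows. This is the classical iterative hard thresholding baseline, plus a soft-thresholding operator for comparison. It assumes a least-squares objective and a step size of one over the squared spectral norm, and it does not implement the paper's new operators (such as reciprocal thresholding). All function names are hypothetical.

```python
import numpy as np

def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    if s > 0:
        keep = np.argsort(np.abs(v))[-s:]
        out[keep] = v[keep]
    return out

def soft_threshold(v, lam):
    """Shrink every entry of v toward zero by lam (the Lasso proximal step)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def iterative_hard_thresholding(A, y, s, n_iter=200):
    """Approximately minimize 0.5 * ||A x - y||^2 subject to x having at most s nonzeros."""
    # Step size 1/L, where L = ||A||_2^2 is the Lipschitz constant of the gradient.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)                # gradient step reduces the objective
        x = hard_threshold(x - step * grad, s)  # thresholding enforces sparsity
    return x
```

Swapping `hard_threshold` for another operator inside the loop is the design choice the paper studies; the sketch only fixes the two familiar endpoints.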

artificial intelligence, machine learning, relative concavity, (18 more...)

arXiv.org Machine Learning

1804.08841

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)


IBM Research Cracks Code on Accelerating Key Machine Learning Algorithms

#artificialintelligence Apr-23-2018, 21:22:09 GMT

Deep learning is well known to be very amenable to GPU acceleration. Accelerating "traditional" machine learning methods like logistic regression, linear regression, and support vector machines with GPUs at scale, has, however, been challenging. Today I am very proud to share a major breakthrough that IBM Research has made in this critical area. A team out of our Zurich IBM Research lab beat a previous performance benchmark set for a machine learning workload by Google by 46 times. The research team trained a logistic regression classifier to predict clicks on advertisements using a Terabyte-scale data set that consists of online advertising click-thru data, containing 4.2 billion training examples and 1 million features.

artificial intelligence, machine learning, regression, (12 more...)

#artificialintelligence

Country: Europe > Switzerland > Zürich > Zürich (0.25)

Genre: Research Report > New Finding (0.85)

Industry: Information Technology > Services (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)


A Theory of Statistical Inference for Ensuring the Robustness of Scientific Results

Coker, Beau, Rudin, Cynthia, King, Gary

arXiv.org Machine Learning Apr-23-2018

Inference is the process of using facts we know to learn about facts we do not know. A theory of inference gives assumptions necessary to get from the former to the latter, along with a definition for and summary of the resulting uncertainty. Any one theory of inference is neither right nor wrong, but merely an axiom that may or may not be useful. Each of the many diverse theories of inference can be valuable for certain applications. However, no existing theory of inference addresses the tendency to choose, from the range of plausible data analysis specifications consistent with prior evidence, those that inadvertently favor one's own hypotheses. Since the biases from these choices are a growing concern across scientific fields, and in a sense the reason the scientific community was invented in the first place, we introduce a new theory of inference designed to address this critical problem. We derive "hacking intervals," which are the range of a summary statistic one may obtain given a class of possible endogenous manipulations of the data. Hacking intervals require no appeal to hypothetical data sets drawn from imaginary superpopulations. A scientific result with a small hacking interval is more robust to researcher manipulation than one with a larger interval, and is often easier to interpret than a classical confidence interval. Some versions of hacking intervals turn out to be equivalent to classical confidence intervals, which means they may also provide a more intuitive and potentially more useful interpretation of classical confidence intervals

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

1804.08646

Country: North America > United States (0.46)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)


A machine learning model for identifying cyclic alternating patterns in the sleeping brain

Chindhade, Aditya, Alshi, Abhijeet, Bhatia, Aakash, Dabhadkar, Kedar, Menon, Pranav Sivadas

arXiv.org Artificial Intelligence Apr-23-2018

Electroencephalography (EEG) is a method to record the electrical signals in the brain. Recognizing the EEG patterns in the sleeping brain gives insight into sleep disorders. The dataset under consideration contains data points associated with numerous physiologies. There are particular patterns associated with the Non-Rapid Eye Movement (NREM) sleep cycle of the brain. This study attempts to generalize the detection of these patterns using a machine learning model. The proposed model uses additional feature engineering to incorporate sequential information for training a classifier to predict the occurrence of Cyclic Alternating Pattern (CAP) sequences in the sleep cycle, which are often associated with sleep disorders.

artificial intelligence, machine learning, sequence, (18 more...)

arXiv.org Artificial Intelligence

1804.0875

Genre: Research Report (0.86)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.30)


Sparse Travel Time Estimation from Streaming Data

Jabari, Saif Eddin, Freris, Nikolaos M., Dilip, Deepthi Mary

arXiv.org Machine Learning Apr-22-2018

We address two shortcomings in online travel time estimation methods for congested urban traffic. The first shortcoming is related to the determination of the number of mixture modes, which can change dynamically, within day and from day to day. The second shortcoming is the wide-spread use of Gaussian probability densities as mixture components. Gaussian densities fail to capture the positive skew in travel time distributions and, consequently, large numbers of mixture components are needed for reasonable fitting accuracy when applied as mixture components. They also assign positive probabilities to negative travel times. To address these issues, this paper develops a mixture distribution with asymmetric components supported on the positive numbers. We use sparse estimation techniques to ensure parsimonious models. Specifically, we derive a novel generalization of Gamma mixture densities using Mittag-Leffler functions, which provides enhanced fitting flexibility and improved parsimony. In order to accommodate within-day variability and allow for online implementation of the proposed methodology (i.e., fast computations on streaming travel time data), we introduce a recursive algorithm which efficiently updates the fitted distribution whenever new data become available. Experimental results using real-world travel time data illustrate the efficacy of the proposed methods.

artificial intelligence, machine learning, mixture component, (17 more...)

arXiv.org Machine Learning

1804.0813

Country:

North America > United States > New York (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
