AITopics | Regression

Regression

Vital Statistics You Never Learned… Because They're Never Taught

@machinelearnbot | Sep-4-2017, 22:20:07 GMT

KG: Starting from the beginning, what is statistics and how did it come about? Could you give us a short definition and history of the discipline? In a nutshell, statistics began as a way to understand the workings of states, productivity, life expectancy, agricultural yields, etc., and to make estimates of things from samples (a statistical example of the latter dates back to the 5th century BCE in Athens). As for a definition, statistics is a science unto itself that benefits all other fields and everyday life. What is unique about statistics is its proven tools for making decisions in the face of uncertainty and for understanding sources of variation and bias, and, most importantly, statistical thinking.

artificial intelligence, machine learning, statistics, (18 more...)

@machinelearnbot

Country: North America > United States (0.15)

Genre: Research Report > Experimental Study (0.48)

Industry: Health & Medicine (0.99)

Technology:

Information Technology > Data Science (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.30)

The Pragmatics of Indirect Commands in Collaborative Discourse

Lamm, Matthew, Eric, Mihail

arXiv.org Artificial Intelligence | Sep-4-2017

Today's artificial assistants are typically prompted to perform tasks through direct, imperative commands such as "Set a timer" or "Pick up the box". However, to progress toward more natural exchanges between humans and these assistants, it is important to understand the way non-imperative utterances can indirectly elicit action of an addressee. In this paper, we investigate command types in the setting of a grounded, collaborative game. We focus on a less understood family of utterances for eliciting agent action, locatives like "The chair is in the other room", and demonstrate how these utterances indirectly command in specific game state contexts. Our work shows that models with domain-specific grounding can effectively realize the pragmatic reasoning that is necessary for more robust natural language interaction.

machine learning, natural language, utterance, (18 more...)

arXiv.org Artificial Intelligence

1705.03454

Country:

North America > United States (0.47)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.51)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.70)
Information Technology > Artificial Intelligence > Natural Language (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Linear regression in Python: Use of numpy, scipy, and statsmodels

@machinelearnbot | Sep-2-2017, 16:55:34 GMT

As we can see, the statsmodels library allows us to generate highly detailed output on a level similar to R, with additional statistics such as skew, kurtosis, R-Squared and AIC. While these readings can be generated through scipy or sklearn, doing so is a more intensive process and in many cases these statistics must be calculated individually.
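To make "calculated individually" concrete, here is a minimal sketch that computes R-squared, AIC, and residual skew and kurtosis by hand for a one-predictor fit, using only the Python standard library. The function name and the sample data are hypothetical. The AIC uses the Gaussian log-likelihood with the intercept and slope counted as parameters, which is assumed to match the convention behind the statsmodels summary.

```python
import math


def simple_ols_report(x, y):
    """Fit y = intercept + slope * x and compute summary statistics by hand."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot

    # Gaussian log-likelihood evaluated at the ML variance estimate ss_res / n.
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(ss_res / n) + 1)
    n_params = 2  # intercept and slope (assumed counting convention)
    aic = 2 * n_params - 2 * log_lik

    # Moment-based skewness and (non-excess) kurtosis of the residuals.
    mean_r = sum(residuals) / n
    m2 = sum((r - mean_r) ** 2 for r in residuals) / n
    m3 = sum((r - mean_r) ** 3 for r in residuals) / n
    m4 = sum((r - mean_r) ** 4 for r in residuals) / n

    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "log_likelihood": log_lik,
        "aic": aic,
        "skew": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Hypothetical data, roughly y = 2x.
report = simple_ols_report([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

Each statistic here needs its own few lines, which is the overhead the excerpt contrasts with a single summary call.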

artificial intelligence, library, machine learning, (12 more...)

@machinelearnbot

Genre: Research Report (0.72)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.75)

When can Multi-Site Datasets be Pooled for Regression? Hypothesis Tests, $\ell_2$-consistency and Neuroscience Applications

Zhou, Hao Henry, Zhang, Yilin, Ithapu, Vamsi K., Johnson, Sterling C., Wahba, Grace, Singh, Vikas

arXiv.org Machine Learning | Sep-2-2017

Many studies in biomedical and health sciences involve small sample sizes due to logistic or financial constraints. Often, identifying weak (but scientifically interesting) associations between a set of predictors and a response necessitates pooling datasets from multiple diverse labs or groups. While there is a rich literature in statistical machine learning to address distributional shifts and inference in multi-site datasets, it is less clear when such pooling is guaranteed to help (and when it does not) -- independent of the inference algorithms we use. In this paper, we present a hypothesis test to answer this question, both for classical and high dimensional linear regression. We precisely identify regimes where pooling datasets across multiple sites is sensible, and how such policy decisions can be made via simple checks executable on each site before any data transfer ever happens. With a focus on Alzheimer's disease studies, we present empirical results showing that in regimes suggested by our analysis, pooling a local dataset with data from an international study improves power.

artificial intelligence, dataset, machine learning, (18 more...)

arXiv.org Machine Learning

1709.0064

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.88)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Adaptive Scaling

Li, Ting, Jing, Bingyi, Ying, Ningchen, Yu, Xianshi

arXiv.org Machine Learning | Sep-2-2017

Preprocessing data is an important step before any data analysis. In this paper, we focus on one particular aspect, namely scaling or normalization. We analyze various scaling methods in common use and study their effects on different statistical learning models. We propose a new two-stage scaling method. First, we use some training data to fit a linear regression model, and then we scale the whole data set based on the regression coefficients. Simulations are conducted to illustrate the advantages of our new scaling method. Some real data analysis is also given.
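The two-stage idea can be sketched as follows. The abstract does not give the exact scaling rule, so multiplying each feature by the absolute value of its fitted coefficient is an assumption made purely for illustration, and all function names and data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Return [intercept, coef_1, ..., coef_p] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def adaptive_scale(train_rows, train_targets, all_rows):
    """Stage 1: fit a regression on training data.
    Stage 2: rescale every feature by |coefficient| (assumed rule)."""
    weights = [abs(c) for c in fit_ols(train_rows, train_targets)[1:]]
    scaled = [[w * v for w, v in zip(weights, row)] for row in all_rows]
    return scaled, weights


# Hypothetical data generated from y = 1 + 3 * x1 + 0.5 * x2.
train_rows = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
train_targets = [1.0, 4.0, 1.5, 4.5, 7.5]
scaled, weights = adaptive_scale(train_rows, train_targets, train_rows)
print(weights)
```

Under this assumed rule, features with larger coefficients are stretched, so a distance-based learner applied afterwards gives them more influence.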

artificial intelligence, machine learning, scaling, (17 more...)

arXiv.org Machine Learning

1709.00566

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Data Science Simplified Part 9: Interactions and Limitations of Regression Models

#artificialintelligence | Sep-1-2017, 10:35:23 GMT

The model predicts or estimates price (target) as a function of engine size, horsepower, and width (predictors). Recall that the multivariate regression model assumes independence between the predictors. It treats horsepower, engine size, and width as if they are not related. In practice, variables are rarely independent. This blog post addresses what to do when they are not.
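One common way to let related predictors influence each other is an interaction term, as the post's title suggests. The sketch below is only an illustration: the coefficients are made-up numbers, not values from the post, and the function names are hypothetical.

```python
def predict_price(coefs, horsepower, engine_size):
    """Linear model with a horsepower x engine_size interaction term."""
    b0, b_hp, b_engine, b_interaction = coefs
    return (b0
            + b_hp * horsepower
            + b_engine * engine_size
            + b_interaction * horsepower * engine_size)


def horsepower_effect(coefs, engine_size):
    """Change in predicted price per unit of horsepower at a given engine size."""
    _, b_hp, _, b_interaction = coefs
    return b_hp + b_interaction * engine_size


# Hypothetical coefficients: intercept, horsepower, engine size, interaction.
coefs = (1000.0, 50.0, 20.0, 0.5)
print(horsepower_effect(coefs, engine_size=100))
print(horsepower_effect(coefs, engine_size=200))
```

Without the interaction coefficient the effect of horsepower would be the same at every engine size. With it, the effect grows or shrinks as engine size changes.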

artificial intelligence, machine learning, predictor, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Cross-Validation Code Visualization: Kind of Fun – Towards Data Science – Medium

@machinelearnbot | Aug-31-2017, 15:00:09 GMT

As the name suggests, cross-validation is the next fun thing after learning Linear Regression because it helps to improve your prediction using the K-Fold strategy. What is K-Fold, you ask? Everything is explained below with code. We copy the target in the dataset to the y variable. To see the dataset, uncomment the print line.
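For readers who want to see the K-Fold strategy before opening the post, here is a minimal standard-library sketch of the index splitting. It is not the post's code. The function name is hypothetical and the folds are contiguous, without shuffling.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


for train, test in k_fold_indices(n_samples=10, k=3):
    print("train:", train, "test:", test)
```

Every sample lands in exactly one test fold, so a model fitted k times is scored once on each observation it did not see during fitting.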

artificial intelligence, cross-validation code visualization, machine learning, (6 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

RANK: Large-Scale Inference with Graphical Nonlinear Knockoffs

Fan, Yingying, Demirkaya, Emre, Li, Gaorong, Lv, Jinchi

arXiv.org Machine Learning | Aug-31-2017

Power and reproducibility are key to enabling refined scientific discoveries in contemporary big data applications with general high-dimensional nonlinear models. In this paper, we provide theoretical foundations on the power and robustness for the model-free knockoffs procedure introduced recently in Candès, Fan, Janson and Lv (2016) in the high-dimensional setting when the covariate distribution is characterized by a Gaussian graphical model. We establish that under mild regularity conditions, the power of the oracle knockoffs procedure with known covariate distribution in high-dimensional linear models is asymptotically one as sample size goes to infinity. When moving away from the ideal case, we suggest the modified model-free knockoffs method called graphical nonlinear knockoffs (RANK) to accommodate the unknown covariate distribution. We provide theoretical justifications on the robustness of our modified procedure by showing that the false discovery rate (FDR) is asymptotically controlled at the target level and the power is asymptotically one with the estimated covariate distribution. To the best of our knowledge, this is the first formal theoretical result on the power for the knockoffs procedure. Simulation results demonstrate that compared to existing approaches, our method performs competitively in both FDR control and power. A real data set is analyzed to further assess the performance of the suggested knockoffs procedure.
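The two quantities the abstract tracks can be made concrete for a single selected variable set. The sketch below computes the false discovery proportion and the fraction of true signals recovered. FDR and power are the expectations of these quantities over repeated experiments. It does not implement the knockoffs procedure itself, and the function names and example sets are hypothetical.

```python
def false_discovery_proportion(selected, true_support):
    """Share of selected variables that are not true signals (0 if none selected)."""
    selected, true_support = set(selected), set(true_support)
    false_discoveries = len(selected - true_support)
    return false_discoveries / max(len(selected), 1)


def empirical_power(selected, true_support):
    """Share of true signals that were selected (0 if there are no true signals)."""
    selected, true_support = set(selected), set(true_support)
    if not true_support:
        return 0.0
    return len(selected & true_support) / len(true_support)


# Hypothetical example: variables 0-3 are true signals, the procedure picked 0, 1, 2, 7.
selected = {0, 1, 2, 7}
true_support = {0, 1, 2, 3}
print(false_discovery_proportion(selected, true_support))
print(empirical_power(selected, true_support))
```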

artificial intelligence, machine learning, procedure, (18 more...)

arXiv.org Machine Learning

1709.00092

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report > Experimental Study (0.67)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Modelling and computation using NCoRM mixtures for density regression

Griffin, Jim, Leisen, Fabrizio

arXiv.org Machine Learning | Aug-31-2017

Normalized compound random measures are flexible nonparametric priors for related distributions. We consider building general nonparametric regression models using normalized compound random measure mixture models. Posterior inference is made using a novel pseudo-marginal Metropolis-Hastings sampler for normalized compound random measure mixture models. The algorithm makes use of a new general approach to the unbiased estimation of Laplace functionals of compound random measures (which includes completely random measures as a special case). The approach is illustrated on problems of density regression.

artificial intelligence, exp, machine learning, (18 more...)

arXiv.org Machine Learning

1608.00874

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Calibrating chemical multisensory devices for real world applications: An in-depth comparison of quantitative Machine Learning approaches

De Vito, S., Esposito, E., Salvato, M., Popoola, O., Formisano, F., Jones, R., Di Francia, G.

arXiv.org Artificial Intelligence | Aug-30-2017

Chemical multisensor devices need calibration algorithms to estimate gas concentrations. Their possible adoption as indicative air quality measurement devices poses new challenges due to the need to operate in continuous monitoring modes in uncontrolled environments. Several issues, including slow dynamics, continue to affect their real world performances. At the same time, the need for estimating pollutant concentrations on board the devices, especially for wearables and IoT deployments, is becoming highly desirable. In this framework, several calibration approaches have been proposed and tested on a variety of proprietary devices and datasets; still, no thorough comparison is available to researchers. This work attempts a benchmarking of the most promising calibration algorithms according to recent literature with a focus on machine learning approaches. We test the techniques against absolute and dynamic performances, generalization capabilities and computational/storage needs using three different datasets sharing continuous monitoring operation methodology. Our results can guide researchers and engineers in the choice of optimal strategy. They show that non-linear multivariate techniques yield reproducible results, outperforming linear approaches. Specifically, the Support Vector Regression method consistently shows good performances in all the considered scenarios. We highlight the enhanced suitability of shallow neural networks in a trade-off between performance and computational/storage needs. We confirm, on a much wider basis, the advantages of dynamic approaches with respect to static ones that only rely on instantaneous sensor array response. The latter have been shown to be the best choice whenever prompt and precise response is needed.
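The static versus dynamic distinction can be illustrated by how the regression inputs are built. A static calibrator sees only the current sensor array reading, while a dynamic one also sees recent history. The abstract does not say how that history is supplied, so the fixed-length window of past readings used below is an assumption, and the function name and data are hypothetical.

```python
def make_lagged_features(readings, n_lags):
    """Build one input row per time step from the current reading plus the
    n_lags readings before it (oldest first). n_lags = 0 gives static inputs."""
    rows = []
    for t in range(n_lags, len(readings)):
        window = readings[t - n_lags:t + 1]
        rows.append([value for reading in window for value in reading])
    return rows


# Hypothetical two-sensor array sampled at four time steps.
readings = [[1, 10], [2, 20], [3, 30], [4, 40]]
print(make_lagged_features(readings, n_lags=0))  # static: instantaneous response
print(make_lagged_features(readings, n_lags=2))  # dynamic: current plus two past readings
```

Any regressor, such as the Support Vector Regression mentioned above, can then be trained on either kind of row. The first n_lags time steps are dropped because they lack a full history.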

artificial intelligence, dataset, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.snb.2017.07.155

1708.09175

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Switzerland (0.14)
Europe > Italy (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Energy > Oil & Gas (0.92)
Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.87)
