AITopics | Regression

Regression

Inject Machine Learning into Significance Test for Misspecified Linear Models

arXiv.org Machine Learning | Jun-4-2020

Owing to its strong interpretability, linear regression is widely used in social science, where significance tests provide the significance level of models or coefficients in traditional statistical inference. However, linear regression methods rely on the assumption that the ground truth function is linear, which does not necessarily hold in practice. As a result, even for simple non-linear cases, linear regression may fail to report the correct significance level. In this paper, we present a simple and effective assumption-free method for linear approximation in both linear and non-linear scenarios. First, we apply a machine learning method to fit the ground truth function on the training set and calculate its linear approximation. Afterward, we obtain the estimator by adding adjustments based on the validation set. We prove concentration inequalities and asymptotic properties of our estimator, which lead to the corresponding significance test. Experimental results show that our estimator significantly outperforms linear regression for non-linear ground truth functions, indicating that our estimator might be a better tool for the significance test.

artificial intelligence, estimator, machine learning, (15 more...)

arXiv.org Machine Learning

2006.03167

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Generalized Penalty for Circular Coordinate Representation

Luo, Hengrui, Patania, Alice, Kim, Jisu, Vejdemo-Johansson, Mikael

arXiv.org Machine Learning | Jun-3-2020

Topological Data Analysis (TDA) provides novel approaches that allow us to analyze the geometrical shapes and topological structures of a dataset. As one important application, TDA can be used for data visualization and dimension reduction. We follow the framework of circular coordinate representation, which allows us to perform dimension reduction and visualization for high-dimensional datasets on a torus using persistent cohomology. In this paper, we propose a method to adapt the circular coordinate framework to take into account sparsity in high-dimensional applications. We use a generalized penalty function instead of an $L_{2}$ penalty in the traditional circular coordinate algorithm. We provide simulation experiments and real data analysis to support our claim that circular coordinates with generalized penalty will accommodate the sparsity in high-dimensional datasets under different sampling schemes while preserving the topological structures.

artificial intelligence, machine learning, spatial reasoning, (14 more...)

arXiv.org Machine Learning

2006.02554

Country:

North America > United States > New York > Richmond County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Ohio > Franklin County > Columbus (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

On the Minimax Optimality of the EM Algorithm for Learning Two-Component Mixed Linear Regression

Kwon, Jeongyeol, Ho, Nhat, Caramanis, Constantine

arXiv.org Machine Learning | Jun-3-2020

We study the convergence rates of the EM algorithm for learning two-component mixed linear regression under all regimes of signal-to-noise ratio (SNR). We resolve a long-standing question that many recent results have attempted to tackle: we completely characterize the convergence behavior of EM, and show that the EM algorithm achieves minimax optimal sample complexity under all SNR regimes. In particular, when the SNR is sufficiently large, the EM updates converge to the true parameter $\theta^{*}$ at the standard parametric convergence rate $\mathcal{O}((d/n)^{1/2})$ after $\mathcal{O}(\log(n/d))$ iterations. In the regime where the SNR is above $\mathcal{O}((d/n)^{1/4})$ and below some constant, the EM iterates converge to a $\mathcal{O}({\rm SNR}^{-1} (d/n)^{1/2})$ neighborhood of the true parameter, when the number of iterations is of the order $\mathcal{O}({\rm SNR}^{-2} \log(n/d))$. In the low SNR regime where the SNR is below $\mathcal{O}((d/n)^{1/4})$, we show that EM converges to a $\mathcal{O}((d/n)^{1/4})$ neighborhood of the true parameters, after $\mathcal{O}((n/d)^{1/2})$ iterations. Notably, these results are achieved under mild conditions of either random initialization or an efficiently computable local initialization. By providing tight convergence guarantees of the EM algorithm in middle-to-low SNR regimes, we fill the remaining gap in the literature, and significantly, reveal that in low SNR, EM changes rate, matching the $n^{-1/4}$ rate of the MLE, a behavior that previous work had been unable to show.
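The EM iteration that the abstract analyzes can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the symmetric model $y = r\,\langle x, \theta^{*}\rangle + \varepsilon$ with a hidden sign $r \in \{-1, +1\}$, Gaussian noise of known standard deviation `sigma`, and the standard closed-form update for that model. The function names `simulate`, `em_step`, and `run_em` are hypothetical.

```python
import numpy as np


def simulate(n, d, theta_star, sigma=1.0, seed=0):
    """Draw n samples from a symmetric two-component mixed linear regression."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    signs = rng.choice([-1.0, 1.0], size=n)  # hidden component labels
    y = signs * (X @ theta_star) + sigma * rng.standard_normal(n)
    return X, y


def em_step(theta, X, y, sigma=1.0):
    """One EM update for the symmetric two-component model."""
    # E-step: posterior mean of the hidden sign given the current theta.
    w = np.tanh(y * (X @ theta) / sigma**2)
    # M-step: least squares against the sign-weighted responses.
    return np.linalg.solve(X.T @ X, X.T @ (w * y))


def run_em(X, y, theta0, sigma=1.0, n_iter=100):
    """Iterate the EM update a fixed number of times from theta0."""
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(n_iter):
        theta = em_step(theta, X, y, sigma)
    return theta
```

Because the two components differ only by sign, the parameter is identifiable only up to sign, so an estimate should be compared against both $\theta^{*}$ and $-\theta^{*}$. Note that `theta0 = 0` is a fixed point of the update and should be avoided as an initializer.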

artificial intelligence, machine learning, snr regime, (18 more...)

arXiv.org Machine Learning

2006.02601

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)

Prediction of short and long-term droughts using artificial neural networks and hydro-meteorological variables

Hassanzadeh, Yousef, Ghazvinian, Mohammadvaghef, Abdi, Amin, Baharvand, Saman, Jozaghi, Ali

arXiv.org Machine Learning | Jun-3-2020

Drought is a creeping natural threat with numerous damaging effects on various aspects of human life. Accurate drought prediction is a promising step in helping policy makers set drought risk management strategies. To this end, choosing appropriate models plays an important role in the prediction approach. In this study, different Artificial Neural Network (ANN) models are employed to predict short- and long-term droughts using the Standardized Precipitation Index (SPI) at different time scales, including 3, 6, 12, 24 and 48 months, in Tabriz city, Iran. Different combinations of the calculated SPI and time series of various hydro-meteorological variables, such as precipitation, wind velocity, relative humidity and sunshine hours for the years 1992 to 2010, are used to train the ANN models. To compare the models' performance, some well-known measures, namely Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and Correlation Coefficient (CC), are used. The results illustrate that using all hydro-meteorological variables significantly improves the prediction of SPI at different time scales.
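The three evaluation measures named in the abstract have standard definitions, sketched below with NumPy. The function names are hypothetical, and the correlation coefficient is assumed here to be the Pearson correlation between observed and predicted values; the abstract does not say which variant the study uses.

```python
import numpy as np


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    observed, predicted = np.asarray(observed, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    observed, predicted = np.asarray(observed, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(observed - predicted)))


def correlation_coefficient(observed, predicted):
    """Pearson correlation coefficient (assumed meaning of CC)."""
    return float(np.corrcoef(observed, predicted)[0, 1])
```

Lower RMSE and MAE and a CC closer to 1 indicate a better fit; RMSE penalizes large individual errors more heavily than MAE.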

artificial intelligence, fuzzy logic, machine learning, (19 more...)

arXiv.org Machine Learning

2006.02581

Country:

Asia > Middle East > Iran > East Azerbaijan Province > Tabriz (0.26)
Africa > East Africa (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
(17 more...)

Genre: Research Report (0.71)

Industry:

Government > Regional Government > North America Government > United States Government (0.46)
Water & Waste Management > Water Management (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Local Interpretability of Calibrated Prediction Models: A Case of Type 2 Diabetes Mellitus Screening Test

Kocbek, Simon, Kocbek, Primoz, Cilar, Leona, Stiglic, Gregor

arXiv.org Machine Learning | Jun-2-2020

Machine Learning (ML) models are often complex and difficult to interpret due to their 'black-box' characteristics. Interpretability of an ML model is usually defined as the degree to which a human can understand the cause of decisions reached by the model. Interpretability is of extremely high importance in many fields of healthcare due to the high levels of risk related to decisions based on ML models. Calibration of ML model outputs is another issue often overlooked when ML models are applied in practice. This paper presents early work examining the impact of prediction model calibration on the interpretability of the results. We present a use case of a patient in a diabetes screening prediction scenario and visualize results using three different techniques to demonstrate the differences between calibrated and uncalibrated regularized regression models.

artificial intelligence, interpretability, machine learning, (15 more...)

arXiv.org Machine Learning

2006.13815

Country:

Europe > Slovenia > Drava > Municipality of Maribor > Maribor (0.06)
North America > United States > California > San Diego County > San Diego (0.05)
Asia > Middle East > Israel (0.05)
(12 more...)

Genre: Research Report > New Finding (0.95)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Application of Machine Learning to Predict the Risk of Alzheimer's Disease: An Accurate and Practical Solution for Early Diagnostics

Cochrane, Courtney, Castineira, David, Shiban, Nisreen, Protopapas, Pavlos

arXiv.org Machine Learning | Jun-2-2020

Alzheimer's Disease (AD) ravages the cognitive ability of more than 5 million Americans and creates an enormous strain on the health care system. This paper proposes a machine learning predictive model for AD development without medical imaging and with fewer clinical visits and tests, in hopes of earlier and cheaper diagnoses. Such earlier diagnoses could be critical to the effectiveness of any drug or medical treatment to cure this disease. Our model is trained and validated using demographic, biomarker and cognitive test data from two prominent research studies: Alzheimer's Disease Neuroimaging Initiative (ADNI) and Australian Imaging, Biomarker & Lifestyle Flagship Study of Aging (AIBL). We systematically explore different machine learning models, pre-processing methods and feature selection techniques. The most performant model demonstrates greater than 90% accuracy and recall in predicting AD, and the results generalize across sub-studies of ADNI and to the independent AIBL study. We also demonstrate that these results are robust to reducing the number of clinical visits or tests per visit. Using a metaclassification algorithm and longitudinal data analysis, we are able to produce a "lean" diagnostic protocol with only 3 tests and 4 clinical visits that can predict Alzheimer's development with 87% accuracy and 79% recall. This novel work can be adapted into a practical early diagnostic tool for predicting the development of Alzheimer's that maximizes accuracy while minimizing the number of necessary diagnostic tests and clinical visits.

alzheimer, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2006.08702

Country:

North America > United States > New York (0.04)
Oceania > Australia (0.04)
North America > United States > New Jersey > Hudson County > Secaucus (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

3 Levels of Data Science

#artificialintelligence | Jun-1-2020, 03:32:01 GMT

This article will discuss what I consider to be the three levels of data science competency, namely: level 1 (basic level); level 2 (intermediate level); and level 3 (advanced level). Competency increases from level 1 to level 3. We shall use Python as the default language, even though other platforms such as R, SAS, and MATLAB could be used as programming languages for data science. The views provided here are my own and are based on my own journey into data science. At level 1, a data science aspirant should be able to work with datasets generally presented in comma-separated values (CSV) file format. They should have competency in data basics, data visualization, and linear regression.

artificial intelligence, competency, machine learning, (7 more...)

#artificialintelligence

Industry: Education (0.57)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.59)

Least-squares regressions via randomized Hessians

Kahale, Nabil

arXiv.org Machine Learning | Jun-1-2020

The recent availability of massive volumes of data fosters the need to design computationally efficient algorithms for optimization in high dimensions. In large-scale machine learning, stochastic gradient descent (SGD) algorithms are among the most effective optimization methods (Bottou, Curtis and Nocedal 2018). For general smooth convex functions, averaged SGD achieves a convergence rate of $O(1/\sqrt{k})$ after $k$ iterations (Nemirovski, Juditsky, Lan and Shapiro 2009). For strongly-convex functions, i.e. when the smallest eigenvalue of the Hessian matrix is bounded away from 0, the convergence rate after $k$ iterations is $O(1/k)$ (Nemirovski, Juditsky, Lan and Shapiro 2009). Variance-reduced SGD algorithms that optimize the sum of $n$ convex functions are described in (Schmidt, Le Roux and Bach 2017, Shalev-Shwartz and Zhang 2013, Johnson and Zhang 2013), and related accelerated methods are analysed in (Shalev-Shwartz and Zhang 2014, Nitanda 2014, Lan and Zhou 2018, Scieur, d'Aspremont and Bach 2018). These methods enjoy linear convergence (an error that decreases exponentially with the number of iterations) in the strongly-convex case. For general smooth convex functions, the stochastic average gradient method (SAG) of Schmidt, Le Roux and Bach (2017) yields a convergence rate of $O(\sqrt{n}/k)$ after $k$ iterations. This paper focuses on least-squares regression, which often arises in scientific computing and data analysis, and is widely used for inference and prediction. Many modern machine learning techniques, such as logistic and ridge regression, the lasso method and neural networks, can be considered extensions of the least-squares regression technique.
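As a point of reference for the baseline methods mentioned in the abstract, here is a minimal sketch of averaged SGD applied to a least-squares objective. It is not the randomized-Hessian method the paper proposes. The objective $\frac{1}{2n}\lVert X\theta - y\rVert^{2}$, the constant step size, the zero initialization, and the function name `averaged_sgd_least_squares` are all assumptions made for illustration.

```python
import numpy as np


def averaged_sgd_least_squares(X, y, step, n_iter, seed=0):
    """Averaged SGD on (1/2n) * ||X @ theta - y||^2 with a constant step size."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)      # current iterate
    theta_avg = np.zeros(d)  # running average of the iterates
    for k in range(1, n_iter + 1):
        i = rng.integers(n)  # sample one observation uniformly at random
        grad = (X[i] @ theta - y[i]) * X[i]  # gradient of the i-th squared residual
        theta = theta - step * grad
        theta_avg += (theta - theta_avg) / k  # incremental mean update
    return theta_avg
```

Each iteration touches a single row of `X`, so the per-step cost is $O(d)$, which is the property that makes SGD-type methods attractive at scale. The returned average can be compared with the exact solution from `np.linalg.lstsq` on problems small enough to solve directly.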

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2006.01017

Country:

North America > United States > New York (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Logistic Regression for Massive Data with Rare Events

Wang, HaiYing

arXiv.org Machine Learning | May-31-2020

This paper studies binary logistic regression for rare events data, or imbalanced data, where the number of events (observations in one class, often called cases) is significantly smaller than the number of nonevents (observations in the other class, often called controls). We first derive the asymptotic distribution of the maximum likelihood estimator (MLE) of the unknown parameter, which shows that the asymptotic variance converges to zero at a rate of the inverse of the number of events instead of the inverse of the full data sample size. This indicates that the available information in rare events data is at the scale of the number of events instead of the full data sample size. Furthermore, we prove that when a small proportion of the nonevents is under-sampled, the resulting under-sampled estimator may have an asymptotic distribution identical to that of the full data MLE. This demonstrates the advantage of under-sampling nonevents for rare events data, because this procedure may significantly reduce the computation and/or data collection costs. Another common practice in analyzing rare events data is to over-sample (replicate) the events, which has a higher computational cost. We show that this procedure may even result in efficiency loss in terms of parameter estimation.
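The under-sampling procedure discussed in the abstract can be illustrated with a small sketch. It is not the paper's estimator: it assumes every event is kept, each nonevent is kept independently with a known probability `rate`, and the fitted intercept is then shifted by `log(rate)`, which is the standard correction for this kind of case-control sampling (the slope coefficients need no correction). The Newton-method fitter and all function names are hypothetical helpers written for this example.

```python
import numpy as np


def undersample_nonevents(X, y, rate, seed=0):
    """Keep every event (y == 1) and each nonevent with probability `rate`."""
    rng = np.random.default_rng(seed)
    keep = (y == 1) | (rng.random(len(y)) < rate)
    return X[keep], y[keep]


def fit_logistic(X, y, n_iter=25):
    """Logistic regression MLE via Newton's method; returns [intercept, slopes...]."""
    Xb = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xb @ beta)))
        grad = Xb.T @ (y - p)                          # score vector
        hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def correct_intercept(beta, rate):
    """Undo the intercept shift caused by sampling nonevents at `rate`."""
    corrected = beta.copy()
    corrected[0] += np.log(rate)
    return corrected
```

The correction follows from the sampled odds being the population odds divided by `rate`, so the subsample log-odds exceed the population log-odds by `-log(rate)`. Fitting on the reduced data is where the computational saving described in the abstract comes from, since the number of retained rows is driven by the number of events rather than the full sample size.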

artificial intelligence, estimator, machine learning, (17 more...)

arXiv.org Machine Learning

2006.00683

Country:

North America > United States > Connecticut (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (0.87)
Research Report > Experimental Study (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)

A machine learning approach for forecasting hierarchical time series

Mancuso, Paolo, Piccialli, Veronica, Sudoso, Antonio M.

arXiv.org Machine Learning | May-31-2020

In this paper, we propose a machine learning approach for forecasting hierarchical time series. Rather than using historical or forecasted proportions, as in standard top-down approaches, we formulate the disaggregation problem as a non-linear regression problem. We propose a deep neural network that automatically learns how to distribute the top-level forecasts to the bottom-level series of the hierarchy, taking into account the characteristics of the aggregate series and the information of the individual series. To evaluate the performance of the proposed method, we analyze hierarchical sales data and electricity demand data. Besides comparison with the top-down approaches, the model is compared with the bottom-up method and the optimal reconciliation method. Results demonstrate that our method not only increases the average forecasting accuracy of the hierarchy but also addresses the need for an automated procedure that generates coherent forecasts for many time series at the same time.
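For context, the standard top-down baseline that the abstract contrasts with can be sketched in a few lines. This is not the paper's neural network: it assumes a two-level hierarchy, uses the average of historical proportions as the disaggregation rule, and the function names `historical_proportions` and `top_down` are hypothetical.

```python
import numpy as np


def historical_proportions(bottom_history):
    """Average share of each bottom-level series in the total.

    bottom_history has shape (T, m): T time steps, m bottom-level series.
    """
    bottom_history = np.asarray(bottom_history, dtype=float)
    totals = bottom_history.sum(axis=1, keepdims=True)
    return (bottom_history / totals).mean(axis=0)


def top_down(top_forecast, proportions):
    """Split top-level forecasts across bottom-level series by fixed proportions.

    Returns an array of shape (H, m) for H forecast steps.
    """
    return np.outer(np.atleast_1d(top_forecast), proportions)
```

Because the proportions sum to one, the bottom-level forecasts add up to the top-level forecast at every step, which is what makes the result coherent. The approach in the abstract replaces the fixed proportions with a learned, non-linear mapping.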

artificial intelligence, forecast, machine learning, (17 more...)

arXiv.org Machine Learning

2006.0063

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > Switzerland (0.04)

Genre: Research Report (1.00)

Industry: Energy > Power Industry (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
