AITopics | Regression

Regression

Deciding when to stop: Efficient stopping of active learning guided drug-target prediction

Temerinac-Ott, Maja, Naik, Armaghan W., Murphy, Robert F.

arXiv.org Machine Learning, Apr-9-2015

Active learning has been shown to reduce the number of experiments needed to obtain high-confidence drug-target predictions. However, to actually save experiments using active learning, it is crucial to have a method to evaluate the quality of the current prediction and decide when to stop the experimentation process. Only by applying reliable stopping criteria to active learning can time and costs in the experimental process actually be saved. We compute active learning traces on simulated drug-target matrices in order to learn a regression model for the accuracy of the active learner. By analyzing the performance of the regression model on simulated data, we design stopping criteria for previously unseen experimental matrices. We demonstrate on four previously characterized drug effect data sets that applying the stopping criteria can result in up to 40% savings of the total experiments for highly accurate predictions.

artificial intelligence, experiment, machine learning, (17 more...)

arXiv.org Machine Learning

1504.02406

Country:

North America > United States (0.14)
Europe > Germany (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)

Strong oracle optimality of folded concave penalized estimation

Fan, Jianqing, Xue, Lingzhou, Zou, Hui

arXiv.org Machine Learning, Apr-2-2015

Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions, and the oracle property is established only for one of the unknown local solutions. A fundamental issue remains: it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap, which has been open for over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges; that is, it produces the same estimator in the next iteration. The general theory is demonstrated using four classical sparse estimation problems: sparse linear regression, sparse logistic regression, sparse precision matrix estimation, and sparse quantile regression.
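
As a rough illustration of the one-step local linear approximation for the sparse linear regression case, the sketch below uses the SCAD penalty (one common folded concave penalty) with a Lasso initial estimate. The function names, the plain coordinate-descent solver, and the default `a = 3.7` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def scad_derivative(t, lam, a=3.7):
    """Derivative of the SCAD penalty evaluated at |t| (a > 2 is assumed)."""
    t = np.abs(t)
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1.0))

def weighted_lasso(X, y, weights, n_iter=200):
    """Coordinate descent for (1/2n)||y - X b||^2 + sum_j weights[j] * |b_j|."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = np.asarray(y, dtype=float).copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]          # remove coordinate j from the fit
            rho = X[:, j] @ resid / n
            # Soft-threshold with a coordinate-specific penalty level.
            beta[j] = np.sign(rho) * max(abs(rho) - weights[j], 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]          # add the updated coordinate back
    return beta

def one_step_lla(X, y, lam, a=3.7):
    """One LLA step: Lasso start, then a Lasso reweighted by the SCAD derivative."""
    p = X.shape[1]
    beta_init = weighted_lasso(X, y, np.full(p, lam))
    weights = scad_derivative(beta_init, lam, a)
    return weighted_lasso(X, y, weights)
```

Coefficients whose initial estimate exceeds `a * lam` in magnitude receive zero weight and are refit without shrinkage, while coefficients near zero keep the full penalty `lam`. On well-separated signals the result should therefore coincide with least squares restricted to the true support.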

artificial intelligence, machine learning, oracle, (18 more...)

arXiv.org Machine Learning

doi: 10.1214/13-AOS1198

1210.5992

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Modeling the Lifespan of Discourse Entities with Application to Coreference Resolution

de Marneffe, Marie-Catherine, Recasens, Marta, Potts, Christopher

Journal of Artificial Intelligence Research, Mar-31-2015

A discourse typically involves numerous entities, but few are mentioned more than once. Distinguishing those that die out after just one mention (singleton) from those that lead longer lives (coreferent) would dramatically simplify the hypothesis space for coreference resolution models, leading to increased performance. To realize these gains, we build a classifier for predicting the singleton/coreferent distinction. The model's feature representations synthesize linguistic insights about the factors affecting discourse entity lifespans (especially negation, modality, and attitude predication) with existing results about the benefits of surface (part-of-speech and n-gram-based) features for coreference resolution. The model is effective in its own right, and the feature representations help to identify the anchor phrases in bridging anaphora as well. Furthermore, incorporating the model into two very different state-of-the-art coreference resolution systems, one rule-based and the other learning-based, yields significant performance improvements.

discourse referent, lifespan model, singleton, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4565

AI Access Foundation

10934

Country:

Asia > Middle East > Israel (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > Texas (0.04)
(8 more...)

Genre: Research Report > New Finding (0.47)

Industry: Media (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Sparse Linear Regression With Missing Data

Ganti, Ravi, Willett, Rebecca M.

arXiv.org Machine Learning, Mar-28-2015

This paper proposes a fast and accurate method for sparse regression in the presence of missing data. The underlying statistical model encapsulates the low-dimensional structure of the incomplete data matrix and the sparsity of the regression coefficients, and the proposed algorithm jointly learns the low-dimensional structure of the data and a linear regressor with sparse coefficients. The proposed stochastic optimization method, Sparse Linear Regression with Missing Data (SLRM), performs an alternating minimization procedure and scales well with the problem size. Large deviation inequalities shed light on the impact of the various problem-dependent parameters on the expected squared loss of the learned regressor. Extensive simulations on both synthetic and real datasets show that SLRM performs better than competing algorithms in a variety of contexts.

artificial intelligence, dataset, machine learning, (18 more...)

arXiv.org Machine Learning

1503.08348

Country:

North America > United States > Wisconsin (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Penalty, Shrinkage, and Preliminary Test Estimators under Full Model Hypothesis

Raheem, Enayetur, Saleh, A. K. Md. Ehsanes

arXiv.org Machine Learning, Mar-24-2015

This paper considers a multiple regression model and compares, under the full model hypothesis, analytically as well as by simulation, the performance characteristics of some popular penalty estimators, namely ridge regression (RR), LASSO, adaptive LASSO (aLASSO), SCAD, and elastic net (EN), against the least squares estimator (LSE), restricted estimator (RE), preliminary test estimator (PTE), and Stein-type estimators (SE and PRSE) when the dimension of the parameter space is smaller than the sample space dimension. We find that RR uniformly dominates LSE, RE, PTE, SE and PRSE, while LASSO, aLASSO, SCAD, and EN uniformly dominate only LSE. Further, it is observed that neither the penalty estimators nor the Stein-type estimators dominate one another.
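
The abstract gives no formulas, so the following is only a minimal sketch of the two simplest estimators it compares, using their standard closed forms in the setting where the sample size exceeds the number of parameters. The ridge parameter `k` and the function names are assumptions for illustration; the sketch shows the shrinkage effect of ridge regression and does not reproduce the paper's risk comparison.

```python
import numpy as np

def least_squares(X, y):
    """Least squares estimator (LSE) via a numerically stable solver."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def ridge(X, y, k):
    """Ridge regression estimator (X'X + k I)^{-1} X'y with k >= 0."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)
```

For any `k > 0` the ridge estimate has a smaller Euclidean norm than the LSE, and with `k = 0` the two coincide whenever `X` has full column rank.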

artificial intelligence, estimator, machine learning, (15 more...)

arXiv.org Machine Learning

1503.0691

Country: North America > United States > Colorado (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Fast Imbalanced Classification of Healthcare Data with Missing Values

Razzaghi, Talayeh, Roderick, Oleg, Safro, Ilya, Marko, Nick

arXiv.org Machine Learning, Mar-20-2015

In the medical domain, data features often contain missing values. This can create serious bias in predictive modeling, and standard data mining methods often perform poorly on such data. In this paper, we propose a new method to simultaneously classify large datasets and reduce the effects of missing values. The proposed method is based on a multilevel framework of the cost-sensitive SVM and the expectation-maximization imputation method for missing values, which relies on iterated regression analyses. We compare classification results of multilevel SVM-based algorithms on public benchmark datasets with imbalanced classes and missing values, as well as on real data from health applications, and show that our multilevel SVM-based method produces fast, more accurate, and more robust classification results.
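
To make the idea of imputation by iterated regression concrete, here is a simplified sketch. It is not the paper's procedure: it starts from column means and repeatedly regresses each incomplete column on the others by ordinary least squares. The function name and the fixed iteration count are assumptions, and every column is assumed to contain at least some observed values.

```python
import numpy as np

def iterative_regression_impute(X, n_iter=10):
    """Fill NaN entries by repeatedly regressing each column on the others."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()
    # Initial guess: replace each missing entry with its column mean.
    col_means = np.nanmean(X, axis=0)
    filled[missing] = np.take(col_means, np.where(missing)[1])
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            rows = missing[:, j]
            if not rows.any():
                continue  # nothing to impute in this column
            others = np.delete(filled, j, axis=1)
            A = np.column_stack([np.ones(len(filled)), others])
            # Fit on rows where column j is observed, predict the rest.
            coef, *_ = np.linalg.lstsq(A[~rows], filled[~rows, j], rcond=None)
            filled[rows, j] = A[rows] @ coef
    return filled
```

Observed entries are never modified; only the positions that were NaN in the input are updated on each pass.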

bioinformatics, data mining, machine learning, (17 more...)

arXiv.org Machine Learning

1503.0625

Country: North America > United States > California (0.28)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.49)

Industry:

Health & Medicine > Health Care Providers & Services (0.95)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Biomedical Informatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

MIST: L0 Sparse Linear Regression with Momentum

Marjanovic, Goran, Ulfarsson, Magnus O., Hero, Alfred O. III

arXiv.org Machine Learning, Mar-18-2015

In the current age of big data acquisition there has been an ever-growing interest in sparse representation, which consists of representing, say, a noisy signal as a linear combination of very few components. This implies that the entire information in the signal can be approximately captured by a small number of components, which has substantial benefits for the analysis, processing, and storage of high-dimensional signals. As a result, sparse linear regression has been widely studied, with many applications in signal and image processing, statistical inference, and machine learning. Specific applications include compressed sensing, denoising, inpainting, deblurring, source separation, sparse image reconstruction, and signal classification.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

1409.7193

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)

A General Framework for Robust Testing and Confidence Regions in High-Dimensional Quantile Regression

Zhao, Tianqi, Kolar, Mladen, Liu, Han

arXiv.org Machine Learning, Mar-17-2015

We propose a robust inferential procedure for assessing uncertainties of parameter estimation in high-dimensional linear models, where the dimension $p$ can grow exponentially fast with the sample size $n$. Our method combines the de-biasing technique with the composite quantile function to construct an estimator that is asymptotically normal. Hence it can be used to construct valid confidence intervals and conduct hypothesis tests. Our estimator is robust and does not require the existence of the first or second moment of the noise distribution. It also preserves efficiency in the sense that the worst-case efficiency loss is less than 30% compared to the square-loss-based de-biased Lasso estimator. In many cases our estimator is close to or better than the latter, especially when the noise is heavy-tailed. Our de-biasing procedure does not require solving the $L_1$-penalized composite quantile regression. Instead, it allows for any first-stage estimator with the desired convergence rate and empirical sparsity. The paper also provides new proof techniques for developing theoretical guarantees of inferential procedures with non-smooth loss functions. To establish the main results, we exploit the local curvature of the conditional expectation of composite quantile loss and apply empirical process theories to control the difference between empirical quantities and their conditional expectations. Our results are established under weaker assumptions compared to existing work on inference for high-dimensional quantile regression. Furthermore, we consider a high-dimensional simultaneous test for the regression parameters by applying the Gaussian approximation and multiplier bootstrap theories. We also study distributed learning and exploit the divide-and-conquer estimator to reduce computational complexity when the sample size is massive. Finally, we provide empirical results to verify the theory.

artificial intelligence, estimator, machine learning, (18 more...)

arXiv.org Machine Learning

1412.8724

Country: North America > United States (0.28)

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

A Multi-Gene Genetic Programming Application for Predicting Students Failure at School

Orove, J. O., Osegi, N. E., Eke, B. O.

arXiv.org Artificial Intelligence, Mar-11-2015

Abstract: Accurately predicting the student failure rate (SFR) at school remains a core problem faced by many in the educational sector, despite several efforts. The procedures for forecasting SFR are rigid and most often require data scaling or conversion into binary form, as is the case with the logistic model, which may lead to loss of information and effect size attenuation. Currently, the application of Genetic Programming (GP) holds great promise and has produced very positive results in different sectors. In this regard, this study developed GPSFARPS, a software application that provides a robust solution to the prediction of SFR using an evolutionary algorithm known as multi-gene genetic programming. The approach is validated by feeding a testing data set to the evolved GP models. Results obtained from GPSFARPS simulations show its ability to evolve a suitable failure rate expression, with fast convergence at 30 generations out of a specified maximum of 500. The multi-gene system was also able to minimize the evolved model expression and accurately predict the student failure rate using a subset of the original expression.

Keywords: Genetic Programming, Student Failure Rate, Multi-Gene GP

1. Introduction: SFR has always been, and will continue to be, a major concern to stakeholders in the educational sector.

artificial intelligence, evolutionary algorithm, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1503.03211

Country:

North America > United States (0.28)
Africa > Nigeria > Rivers State (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Education > Educational Setting > K-12 Education > Secondary School (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Minimax Optimal Rates of Estimation in High Dimensional Additive Models: Universal Phase Transition

Yuan, Ming, Zhou, Ding-Xuan

arXiv.org Machine Learning, Mar-10-2015

Our results reveal an interesting phase transition behavior universal to this class of high dimensional problems. In the sparse regime, when the components are sufficiently smooth or the dimensionality is sufficiently large, the optimal rates are identical to those for high dimensional linear regression, and therefore there is no additional cost to entertain a nonparametric model. Otherwise, in the so-called smooth regime, the rates coincide with the optimal rates for estimating a univariate function, and therefore they are immune to the "curse of dimensionality".

Key words: Convergence rate, method of regularization, minimax optimality, phase transition, reproducing kernel Hilbert space, Sobolev space.

1 Introduction: With the recent advances in science and technology, high dimensional regression problems have become ubiquitous in a multitude of areas; genomics, medical imaging, and finance are a few well-known examples. A considerable amount of research effort has been devoted to understanding the challenges brought about by high dimensionality and to developing statistical methodology to counter them.

artificial intelligence, machine learning, probability, (16 more...)

arXiv.org Machine Learning

1503.02817

Country:

Asia (0.46)
North America > United States > Wisconsin (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)
