AITopics

2011.11117

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)

arXiv.org Machine LearningNov-22-2020

Robust Gaussian Process Regression Based on Iterative Trimming

Li, Zhao-Zhou, Li, Lu, Shao, Zhengyi

The model prediction of the Gaussian process (GP) regression can be significantly biased when the data are contaminated by outliers. We propose a new robust GP regression algorithm that iteratively trims a portion of the data points with the largest deviation from the predicted mean. While the new algorithm retains the attractive properties of the standard GP as a nonparametric and flexible regression method, it can significantly reduce the influence of outliers even in some extreme cases. It is also easier to implement than previous robust GP variants that rely on approximate inference. Applied to various synthetic datasets with contaminations, the proposed method outperforms the standard GP and the popular robust GP variant with the Student's t likelihood, especially when the outlier fraction is high. Lastly, as a practical example in the astrophysical study, we show that this method can determine the main-sequence ridge line precisely in the color-magnitude diagram of star clusters.

dataset, gaussian process, outlier, (14 more...)

2011.11057

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > United States > Washington > King County > Bellevue (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

#artificialintelligenceNov-21-2020, 12:40:11 GMT

The Confusion Behind Logistic Regression

Logistic Regression is the most confusing unsupervised Machine learning Algorithm. As it is a Classification type Algorithm, the word Regression in its naming often tends to confuse the newbies in the Data Science world. We know that Linear Regression is used to predict the output, which consists of continuous values. However, imagine a situation if we want to classify our continuous outputs into various classes. E.g., if we are data of marks of students and based on the percentage output, we want to classify whether it's pass or fail.

cost function, hypothesis, regression, (14 more...)

Genre:

Research Report > New Finding (0.64)
Research Report > Experimental Study (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligenceNov-20-2020, 15:41:06 GMT

Mistakes in Applying Univariate Feature Selection Methods

In Python scikit-learn library, there are various univariate feature selection methods such as Regression F-score, ANOVA and Chi-squared. Perhaps due to the ease of applying these methods (sometimes with just a single line of code), it might be tempting to just use these methods without taking into consideration the type of features you have. I have seen some machine learning practitioners took this for granted and made this mistake (including myself). While the scikit-learn documentation is clear on which feature selection method should be used for regression and classification, it does not specify whether these methods are suitable to apply to both continuous and categorical features. Let's say you have a classification task and after reading the documentation, you know you should use either Chi-squared test or ANOVA.

categorical feature, chi-squared test, continuous feature, (11 more...)

Genre: Research Report > Experimental Study (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.32)

arXiv.org Machine LearningNov-20-2020

On the Benefits of Multiple Gossip Steps in Communication-Constrained Decentralized Optimization

Hashemi, Abolfazl, Acharya, Anish, Das, Rudrajit, Vikalo, Haris, Sanghavi, Sujay, Dhillon, Inderjit

In decentralized optimization, it is common algorithmic practice to have nodes interleave (local) gradient descent iterations with gossip (i.e. averaging over the network) steps. Motivated by the training of large-scale machine learning models, it is also increasingly common to require that messages be {\em lossy compressed} versions of the local parameters. In this paper, we show that, in such compressed decentralized optimization settings, there are benefits to having {\em multiple} gossip steps between subsequent gradient iterations, even when the cost of doing so is appropriately accounted for e.g. by means of reducing the precision of compressed information. In particular, we show that having $O(\log\frac{1}{\epsilon})$ gradient iterations {with constant step size} - and $O(\log\frac{1}{\epsilon})$ gossip steps between every pair of these iterations - enables convergence to within $\epsilon$ of the optimal value for smooth non-convex objectives satisfying Polyak-\L{}ojasiewicz condition. This result also holds for smooth strongly convex objectives. To our knowledge, this is the first work that derives convergence results for nonconvex optimization under arbitrary communication compression.

algorithm, communication, convergence, (13 more...)

2011.10643

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Massachusetts (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)

Wang, Ke, Thrampoulidis, Christos

Benign Overfitting in Binary Classification of Gaussian Mixtures

arXiv.org Machine LearningNov-18-2020

Deep neural networks generalize well despite being exceedingly overparametrized, but understanding the statistical principles behind this so called benign-overfitting phenomenon is not yet well understood. Recently there has been remarkable progress towards understanding benign-overfitting in simpler models, such as linear regression and, even more recently, linear classification. This paper studies benign-overfitting for data generated from a popular binary Gaussian mixtures model (GMM) and classifiers trained by support-vector machines (SVM). Our approach has two steps. First, we leverage an idea introduced in (Muthukumar et al. 2020) to relate the SVM solution to the least-squares (LS) solution. Second, we derive novel non-asymptotic bounds on the classification error of LS solution. Combining the two gives sufficient conditions on the overparameterization ratio and the signal-to-noise ratio that lead to benign overfitting. We corroborate our theoretical findings with numerical simulations.

arxiv preprint arxiv, classification error, probability, (14 more...)

2011.09148

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

#artificialintelligenceNov-17-2020, 13:30:59 GMT

Logistic Regression with PyTorch

We learned about linear regression in the last post, and now we move to logistic regression. As we cannot use linear regression for a classification task, we use logistic, which is an extension to linear regression for a classification task. Logistic regression is a good approach to problems that require probability as output. Suppose we create a model to predict a striker's probability when he is playing an away match. If a model predicts a p(score Away_Match) of 0.05, then overall away matches, the striker will score approximately 1 goal.

optimizer, probability, regression, (12 more...)

Genre:

Research Report > New Finding (0.83)
Research Report > Experimental Study (0.83)

Industry: Health & Medicine (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligenceNov-17-2020, 01:25:56 GMT

How do you know what independent variables to include in a regression model?

After a few biostatistics classes, I began fitting my first logistic regression model using my physician friend's data on tumors excised from skin cancer patients. I realized that although we were very clear about the dependent variable we were trying to predict – a certain feature of the tumor – I really did not know how to pick the independent variables that belonged in the model. We only had a few to choose from in our dataset, and I put them all into the model, but I wasn't really sure what to do next. Remove the ones that had a slope with a p 0.05? I asked one of my professors what to do, and in her own idiosyncratic way, she seemed to describe what I will call "stepwise selection".

independent variable, regression model, stepwise selection, (3 more...)

Genre:

Research Report > New Finding (0.59)
Research Report > Experimental Study (0.59)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.95)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.97)

He, Yang-Hui, Lee, Kyu-Hwan, Oliver, Thomas

Machine-Learning Number Fields

arXiv.org Machine LearningNov-17-2020

We show that standard machine-learning algorithms may be trained to predict certain invariants of algebraic number fields to high accuracy. A random-forest classifier that is trained on finitely many Dedekind zeta coefficients is able to distinguish between real quadratic fields with class number 1 and 2, to 0.96 precision. Furthermore, the classifier is able to extrapolate to fields with discriminant outside the range of the training data. When trained on the coefficients of defining polynomials for Galois extensions of degrees 2, 6, and 8, a logistic regression classifier can distinguish between Galois groups and predict the ranks of unit groups with precision >0.97.

class number 1, classifier, number field, (12 more...)

2011.08958

Country:

North America > United States > Connecticut > Tolland County > Storrs (0.14)
North America > United States > Illinois > Champaign County > Champaign (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

arXiv.org Machine LearningNov-17-2020

Data Driven Reaction Mechanism Estimation via Transient Kinetics and Machine Learning

Kunz, M. Ross, Yonge, Adam, Fang, Zongtang, Medford, Andrew J., Constales, Denis, Yablonsky, Gregory, Fushimi, Rebecca

Understanding the set of elementary steps and kinetics in each reaction is extremely valuable to make informed decisions about creating the next generation of catalytic materials. With physical and mechanistic complexity of industrial catalysts, it is critical to obtain kinetic information through experimental methods. As such, this work details a methodology based on the combination of transient rate/concentration dependencies and machine learning to measure the number of active sites, the individual rate constants, and gain insight into the mechanism under a complex set of elementary steps. This new methodology was applied to simulated transient responses to verify its ability to obtain correct estimates of the micro-kinetic coefficients. Furthermore, experimental CO oxidation data was analyzed to reveal the Langmuir-Hinshelwood mechanism driving the reaction. As oxygen accumulated on the catalyst, a transition in the mechanism was clearly defined in the machine learning analysis due to the large amount of kinetic information available from transient reaction techniques. This methodology is proposed as a new data driven approach to characterize how materials control complex reaction mechanisms relying exclusively on experimental data.

carbon monoxide, coefficient, mechanism, (13 more...)

2011.0881

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New Jersey (0.04)
North America > United States > Idaho (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Energy (1.00)
Materials > Chemicals (0.90)
Government > Regional Government > North America Government > United States Government (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)