AITopics | Regression

Regression

Decision functions from supervised machine learning algorithms as collective variables for accelerating molecular simulations

arXiv.org Machine Learning | Feb-28-2018

Selection of appropriate collective variables for enhancing molecular simulations remains an unsolved problem in computational biophysics. In particular, picking initial collective variables (CVs) is especially challenging in higher dimensions. Which atomic coordinates or transforms thereof from a list of thousands should one pick for enhanced sampling runs? How does a modeler even begin to pick starting coordinates for investigation? This remains true even in the case of simple two-state systems and only increases in difficulty for multi-state systems. In this work, we attempt to solve the initial CV problem using a data-driven approach inspired by the supervised machine learning literature. In particular, we show how the decision functions in supervised machine learning (SML) algorithms can be used as initial CVs for accelerated sampling. Using solvated alanine dipeptide and Chignolin mini-protein as our test cases, we illustrate how the distance to the Support Vector Machine decision hyperplane, the output probability estimates from Logistic Regression, and other classifiers may be used to reversibly sample slow structural transitions. We discuss the utility of other SML algorithms that might be useful for identifying CVs for accelerating molecular simulations.
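
To make the idea of a classifier's decision function serving as a CV concrete, here is a minimal sketch; it is not the authors' code. It trains a plain logistic-regression classifier on two labeled states and uses the signed distance to its decision hyperplane as a one-dimensional coordinate. The function names, the two-dimensional toy features, and the choice of logistic regression trained by gradient descent are all assumptions made for illustration.

```python
import math

def train_logistic(X, y, lr=0.1, steps=2000):
    """Fit weights w and bias b of a logistic-regression classifier by gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            # Predicted probability of state B for this frame
            p = 1.0 / (1.0 + math.exp(-(sum(wj * xj for wj, xj in zip(w, xi)) + b)))
            for j, xj in enumerate(xi):
                grad_w[j] += (p - yi) * xj / n
            grad_b += (p - yi) / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def collective_variable(x, w, b):
    """Signed distance of a frame's feature vector x to the decision hyperplane."""
    norm = math.sqrt(sum(wj * wj for wj in w))
    return (sum(wj * xj for wj, xj in zip(w, x)) + b) / norm

# Hypothetical feature vectors for frames from state A (label 0) and state B (label 1)
X = [[-1.0, 0.2], [-1.2, -0.1], [-0.8, 0.0], [1.0, 0.1], [1.1, -0.2], [0.9, 0.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(X, y)
cv_a = collective_variable(X[0], w, b)  # negative side of the hyperplane
cv_b = collective_variable(X[3], w, b)  # positive side of the hyperplane
```

The sign of the coordinate separates the two states, and its magnitude grows as a frame moves away from the boundary, which is the property a biasing scheme would rely on.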

artificial intelligence, machine learning, simulation, (19 more...)

arXiv.org Machine Learning

1802.1051

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.35)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.46)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Machine Learning Basics -- Part 1 -- Concept of Regression

#artificialintelligence | Feb-27-2018, 22:22:43 GMT

In this article I revisit the material from Andrew Ng's machine learning course on Coursera and give an overview of its concepts. All quotes refer to material from the course unless explicitly stated otherwise. Linear regression tries to fit a line to a set of data points. This optimized line (the model) can predict output values for given input values and can be plotted. We want to set the parameters so that the difference between the predicted and the real values is minimal.
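
As a minimal sketch of the parameter fitting described in this summary (not code from the article or the course), the following fits a line by batch gradient descent on the squared error. The function name `fit_line`, the learning rate, the step count, and the toy data are illustrative assumptions.

```python
def fit_line(xs, ys, lr=0.05, steps=5000):
    """Fit y ~ w*x + b by batch gradient descent on half the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals: prediction minus observed value for each point
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of (1/(2n)) * sum(err^2) with respect to w and b
        grad_w = sum(e * x for e, x in zip(errs, xs)) / n
        grad_b = sum(errs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical toy data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
```

On this noise-free data the fitted slope and intercept approach 2 and 1; with real data the learning rate usually needs tuning, and features are often rescaled first.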

artificial intelligence, gradient descent, machine learning, (13 more...)

#artificialintelligence

Industry: Education (0.77)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.37)

Machine Learning Crash Course, Part II: Unsupervised Machine Learning | IoT For All

#artificialintelligence | Feb-27-2018, 16:47:00 GMT

In part one of the machine learning crash course, we introduced the field of supervised machine learning (ML) by walking through popular algorithms like linear regression and logistic regression. But supervised learning is just one of the many types of algorithms in the vast machine learning / artificial intelligence space. In this article, we take a look at two other subdisciplines: Unsupervised learning and deep learning. When performing supervised learning, our datasets consisted of labeled examples. In the linear regression example, we had TV advertising data labeled with the amount of sales generated.

algorithm, artificial intelligence, machine learning, (14 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.62)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.59)

A Tour of The Top 10 Algorithms for Machine Learning Newbies

#artificialintelligence | Feb-27-2018, 12:18:29 GMT

In machine learning, there's something called the "No Free Lunch" theorem. In a nutshell, it states that no one algorithm works best for every problem, and it's especially relevant for supervised learning. For example, you can't say that neural networks are always better than decision trees or vice-versa. There are many factors at play, such as the size and structure of your dataset. As a result, you should try many different algorithms for your problem, while using a hold-out "test set" of data to evaluate performance and select the winner.
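
The hold-out procedure described above can be sketched as follows. This is an illustrative example rather than code from the article: the two candidate models (a mean baseline and a 1-nearest-neighbour predictor), the synthetic data, and the 80/20 split are assumptions.

```python
import random

def mse(model, data):
    """Mean squared error of a callable model on (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def fit_mean(train):
    """Baseline: always predict the mean training target."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_nearest(train):
    """1-nearest-neighbour: predict the target of the closest training x."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

rng = random.Random(0)
# Hypothetical data: y = 3x plus small Gaussian noise
data = [(x, 3 * x + rng.gauss(0, 0.1)) for x in (i / 10 for i in range(100))]
rng.shuffle(data)
train, test = data[:80], data[80:]  # hold out 20% for evaluation only

candidates = {"mean": fit_mean, "nearest": fit_nearest}
# Fit each candidate on the training split, score it on the held-out split
scores = {name: mse(fit(train), test) for name, fit in candidates.items()}
winner = min(scores, key=scores.get)
```

Each candidate sees only the training split during fitting, so the held-out scores give a fair comparison on data that neither model was fitted to.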

algorithm, artificial intelligence, machine learning, (14 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.31)

High-dimensional ABC

Nott, D. J., Ong, V. M. -H., Fan, Y., Sisson, S. A.

arXiv.org Machine Learning | Feb-27-2018

This Chapter, "High-dimensional ABC", is to appear in the forthcoming Handbook of Approximate Bayesian Computation (2018). It details the main ideas and concepts behind extending ABC methods to higher dimensions, with supporting examples and illustrations.

approximation, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

1802.09725

Country: Oceania > Australia (0.46)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)

Testing for Feature Relevance: The HARVEST Algorithm

Weisberg, Herbert, Pontes, Victor, Thoma, Mathis

arXiv.org Machine Learning | Feb-27-2018

Feature selection with high-dimensional data and a very small proportion of relevant features poses a severe challenge to standard statistical methods. We have developed a new approach (HARVEST) that is straightforward to apply, albeit somewhat computer-intensive. This algorithm can be used to pre-screen a large number of features to identify those that are potentially useful. The basic idea is to evaluate each feature in the context of many random subsets of other features. HARVEST is predicated on the assumption that an irrelevant feature can add no real predictive value, regardless of which other features are included in the subset. Motivated by this idea, we have derived a simple statistical test for feature relevance. Empirical analyses and simulations produced so far indicate that the HARVEST algorithm is highly effective in predictive analytics, both in science and business.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

1710.0021

Genre: Research Report > Experimental Study (0.95)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

Data Science Simplified Part 6: Model Selection Methods

@machinelearnbot | Feb-26-2018, 05:20:22 GMT

In the last article of this series, we discussed the multivariate linear regression model. Fernando created a model that estimates the price of a car based on five input parameters. Fernando indeed has a better model. Yet he wants to select the best set of input variables. The idea behind model selection methods is intuitive. How is an optimal model defined?

artificial intelligence, fernando, machine learning, (12 more...)

@machinelearnbot

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.76)

Interpreting Complex Regression Models

Avigdor-Elgrabli, Noa, Libov, Alex, Viderman, Michael, Wolff, Ran

arXiv.org Machine Learning | Feb-26-2018

Interpretation of machine-learning-induced models is critical for feature engineering, debugging, and, arguably, compliance. Yet best-of-breed machine learning models tend to be very complex. This paper presents a method for model interpretation whose main benefit is that the simple interpretations it provides are always grounded in actual sets of learning examples. The method is validated on the task of interpreting a complex regression model in the context of both an academic problem (predicting the year in which a song was recorded) and an industrial one (predicting mail user churn).

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

1802.09225

Country:

North America > United States > New York (0.14)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.65)

Industry: Information Technology > Security & Privacy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.61)

Lasso Regularization Paths for NARMAX Models via Coordinate Descent

Ribeiro, Antônio H., Aguirre, Luis A.

arXiv.org Machine Learning | Feb-26-2018

We propose a new algorithm for estimating NARMAX models with $L_1$ regularization for models represented as a linear combination of basis functions. Due to the $L_1$-norm penalty the Lasso estimation tends to produce some coefficients that are exactly zero and hence gives interpretable models. The novelty of the contribution is the inclusion of error regressors in the Lasso estimation (which yields a nonlinear regression problem). The proposed algorithm uses cyclical coordinate descent to compute the parameters of the NARMAX models for the entire regularization path. It deals with the error terms by updating the regressor matrix along with the parameter vector. In comparative timings we find that the modification does not reduce the computational efficiency of the original algorithm and can provide the most important regressors in very few inexpensive iterations. The method is illustrated for linear and polynomial models by means of two examples.
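
For orientation, here is a minimal sketch of cyclical coordinate descent for the ordinary Lasso with soft-thresholding. It does not implement the paper's NARMAX extension (there are no error regressors and the regressor matrix is never updated), and it recomputes each point on the path from scratch instead of using warm starts. The function names, the toy design matrix, and the penalty values are assumptions.

```python
def soft_threshold(rho, lam):
    """Soft-thresholding operator S(rho, lam)."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(X, y, lam, sweeps=200):
    """Minimise (1/(2n))*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, with beta = 0 initially
    for _ in range(sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the partial residual (feature j excluded)
            rho = sum(v * (r + v * beta[j]) for v, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / z
            # Keep the residual in sync with the updated coefficient
            delta = new - beta[j]
            resid = [r - v * delta for v, r in zip(col, resid)]
            beta[j] = new
    return beta

# Hypothetical design with orthogonal columns; y depends only on the first column
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
# A small regularization path: one coefficient vector per penalty value
path = {lam: lasso_cd(X, y, lam) for lam in (0.0, 0.5, 2.0)}
```

As the penalty grows, the first coefficient shrinks from 2 toward 0 while the irrelevant second coefficient stays at exactly zero, which is the sparsity the abstract refers to.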

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1710.00598

Country: South America > Brazil (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

On b-bit min-wise hashing for large-scale regression and classification with sparse data

Shah, Rajen D., Meinshausen, Nicolai

arXiv.org Machine Learning | Feb-26-2018

Large-scale regression problems where both the number of variables, $p$, and the number of observations, $n$, may be large and on the order of millions or more are becoming increasingly common. Typically the data are sparse: only a fraction of a percent of the entries in the design matrix are non-zero. Nevertheless, often the only computationally feasible approach is to perform dimension reduction to obtain a new design matrix with far fewer columns and then work with this compressed data. $b$-bit min-wise hashing (Li and Konig, 2011) is a promising dimension reduction scheme for sparse matrices which produces a set of random features such that regression on the resulting design matrix approximates a kernel regression with the resemblance kernel. In this work, we derive bounds on the prediction error of such regressions. For both linear and logistic models we show that the average prediction error vanishes asymptotically as long as $q \|\beta^*\|_2^2 /n \rightarrow 0$, where $q$ is the average number of non-zero entries in each row of the design matrix and $\beta^*$ is the coefficient of the linear predictor. We also show that ordinary least squares or ridge regression applied to the reduced data can in fact allow us to fit more flexible models. We obtain non-asymptotic prediction error bounds for interaction models and for models where an unknown row normalisation must be applied in order for the signal to be linear in the predictors.
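
A rough sketch of the hashing step may help readers unfamiliar with the scheme. It is not the construction analysed in the paper: random permutations are replaced by simple modular hash functions, each row is represented by the set of its non-zero column indices, and the resemblance estimate assumes rows sparse enough that two unrelated minima agree on their lowest $b$ bits with probability close to $2^{-b}$. All function names and the example rows are hypothetical, and rows are assumed to be non-empty.

```python
import random

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, larger than any index used here

def make_hashes(k, seed=0):
    """Draw k hash functions h(x) = (a*x + c) mod PRIME, standing in for random permutations."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(k)]

def bbit_signature(nonzero_idx, hashes, b):
    """For each hash, take the minimum hashed index and keep only its lowest b bits."""
    mask = (1 << b) - 1
    return [min((a * i + c) % PRIME for i in nonzero_idx) & mask for a, c in hashes]

def estimate_resemblance(sig1, sig2, b):
    """Corrected match rate, assuming chance agreement happens with probability about 2^-b."""
    match = sum(u == v for u, v in zip(sig1, sig2)) / len(sig1)
    chance = 2.0 ** -b
    return (match - chance) / (1.0 - chance)

# Two hypothetical sparse rows, given by the column indices of their non-zero entries
row_a = {1, 5, 9, 42}
row_b = {1, 5, 9, 77}
hashes = make_hashes(k=200)
sig_a = bbit_signature(row_a, hashes, b=2)
sig_b = bbit_signature(row_b, hashes, b=2)
similarity = estimate_resemblance(sig_a, sig_b, b=2)
```

Each row is compressed to 200 values of 2 bits each regardless of the original number of columns; a regression would then be run on features derived from these signatures.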

artificial intelligence, machine learning, regression, (17 more...)

arXiv.org Machine Learning

1308.1269

Country: Europe (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)
