AITopics

Genre: Instructional Material > Course Syllabus & Notes (0.59)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.95)

arXiv.org Machine Learning, Dec-4-2017

Saturating Splines and Feature Selection

Boyd, Nicholas; Hastie, Trevor; Boyd, Stephen; Recht, Benjamin; Jordan, Michael

We extend the adaptive regression spline model by incorporating saturation, the natural requirement that a function extend as a constant outside a certain range. We fit saturating splines to data using a convex optimization problem over a space of measures, which we solve using an efficient algorithm based on the conditional gradient method. Unlike many existing approaches, our algorithm solves the original infinite-dimensional (for splines of degree at least two) optimization problem without pre-specified knot locations. We then adapt our algorithm to fit generalized additive models with saturating splines as coordinate functions and show that the saturation requirement allows our model to simultaneously perform feature selection and nonlinear function fitting. Finally, we briefly sketch how the method can be extended to higher order splines and to different requirements on the extension outside the data range.
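To make the saturation requirement concrete, here is a minimal sketch for the simplest case, a degree-1 (piecewise-linear) spline written as a sum of hinge functions. It is not the paper's algorithm: the names `hinge_spline` and `saturates`, the knots, and the weights are all illustrative assumptions. The only fact it relies on is that such a spline is zero to the left of its first knot and is constant to the right of its last knot exactly when the hinge weights sum to zero.

```python
def hinge_spline(x, knots, weights, intercept=0.0):
    """Evaluate f(x) = intercept + sum_j weights[j] * max(x - knots[j], 0)."""
    return intercept + sum(w * max(x - t, 0.0) for t, w in zip(knots, weights))


def saturates(weights, tol=1e-12):
    """A hinge spline is constant right of its last knot iff its weights sum to zero."""
    return abs(sum(weights)) <= tol


# Hypothetical knots and weights, chosen only for illustration:
# slope 2 on [0, 1], slope 1 on [1, 3], slope 0 (saturated) beyond 3.
knots = [0.0, 1.0, 3.0]
weights = [2.0, -1.0, -1.0]

left = [hinge_spline(x, knots, weights) for x in (-5.0, -1.0, 0.0)]
right = [hinge_spline(x, knots, weights) for x in (3.0, 10.0, 100.0)]
print(left, right, saturates(weights))
```

Dropping the last weight (so the weights sum to 1 rather than 0) would make the function grow linearly past the final knot, which is what the saturation constraint rules out.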

artificial intelligence, machine learning, spline

1609.06764

Country: North America > United States (0.93)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

Wu, Anqi; Koyejo, Oluwasanmi; Pillow, Jonathan W.

Dependent relevance determination for smooth and structured sparse regression

arXiv.org Machine Learning, Dec-4-2017

In many problem settings, parameter vectors are not merely sparse, but dependent in such a way that non-zero coefficients tend to cluster together. We refer to this form of dependency as "region sparsity". Classical sparse regression methods, such as the lasso and automatic relevance determination (ARD), model parameters as independent a priori and therefore do not exploit such dependencies. Here we introduce a hierarchical model for smooth, region-sparse weight vectors and tensors in a linear regression setting. Our approach represents a hierarchical extension of the relevance determination framework, where we add a transformed Gaussian process to model the dependencies between the prior variances of regression weights. We combine this with a structured model of the prior variances of Fourier coefficients, which eliminates unnecessary high frequencies. The resulting prior encourages weights to be region-sparse in two different bases simultaneously. We develop a Laplace approximation and Markov chain Monte Carlo (MCMC) sampling to provide efficient inference for the posterior. Furthermore, we provide a two-stage convex relaxation of the Laplace approximation approach to mitigate the non-convexity that inevitably arises during optimization. Finally, we show substantial improvements over comparable methods on both simulated data and real datasets from brain imaging.

artificial intelligence, machine learning, relevance determination

1711.10058

Country: North America > United States (1.00)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

#artificialintelligence, Dec-3-2017, 19:35:43 GMT

Intuition of applying PCA before logistic regression

I came across this paragraph about logistic regression with PCA in Kevin P. Murphy's book on machine learning. If we use PCA first and then apply logistic regression, the overall model is still representable as a logistic regression problem, but the problem is constrained, since we have forced the logistic regression to work in the subspace spanned by the PCA vectors. Consider 100 training vectors randomly positioned in a 1000-dimensional space, each with a random class 0 or 1. With very high probability, these 100 vectors will be linearly separable. Now project these vectors onto a 10-dimensional space: with very high probability, the 100 projected vectors will not be linearly separable.
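The two "very high probability" claims can be checked with a short calculation. The sketch below is not from the quoted book: it uses Cover's function-counting theorem (1965), which gives the number of labelings of n points in general position that an affine hyperplane can separate, and it assumes that both the random vectors and their projections are in general position. The function name `separable_probability` is hypothetical.

```python
from fractions import Fraction
from math import comb


def separable_probability(n_points, dim):
    """Probability that a uniformly random 0/1 labeling of n_points in general
    position in R^dim can be separated by an affine hyperplane (Cover, 1965)."""
    # Number of separable labelings: 2 * sum_{k=0}^{dim} C(n_points - 1, k).
    count = 2 * sum(comb(n_points - 1, k) for k in range(dim + 1))
    return Fraction(count, 2 ** n_points)


p_high = separable_probability(100, 1000)  # original 1000-dimensional space
p_low = separable_probability(100, 10)     # after projecting onto 10 dimensions
print(float(p_high), float(p_low))
```

Under these assumptions the first probability is exactly 1, because 100 points cannot exhaust 1000 dimensions, while the second is vanishingly small.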

artificial intelligence, logistic regression, machine learning

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

@machinelearnbot, Dec-3-2017, 11:35:12 GMT

Using TensorFlow for Predictive Analytics with Linear Regression

Since its release in 2015 by the Google Brain team, TensorFlow has been a driving force in conversations centered on artificial intelligence, machine learning, and predictive analytics. With its flexible architecture, TensorFlow provides numerical computation with a high degree of parallelism, which appeals to both small and large businesses. Because TensorFlow is built on stateful dataflow graphs that can run across multiple systems, it supports parallel processing, so data can be leveraged in a meaningful way without requiring petabytes of it. To demonstrate how you can take advantage of TensorFlow without having huge silos of data on hand, I'll explain how to use TensorFlow to build a linear regression model in this post. Linear modeling is a relatively simple mathematical method that, when used properly, can help predict modeled behavior.
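As a rough illustration of what fitting a linear regression model involves, here is a minimal sketch in plain Python rather than TensorFlow, so it runs without any dependencies. It is not the post's code: the function name `fit_line`, the learning rate, the step count, and the synthetic data are all assumptions made for the example. It fits a line by batch gradient descent on the mean squared error, the same kind of loss minimization a TensorFlow optimizer would perform.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of (1/n) * sum(error^2) with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic, noise-free data from y = 3x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x + 1.0 for x in xs]
w, b = fit_line(xs, ys)
print(round(w, 4), round(b, 4))
```

On this toy data the fitted slope and intercept converge to the generating values; with noisy data they would converge to the least-squares estimates instead.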

artificial intelligence, machine learning, tensorflow


Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

arXiv.org Machine Learning, Dec-3-2017

Gaussian Process Regression for Arctic Coastal Erosion Forecasting

Kupilik, Matthew; Witmer, Frank; MacLeod, Euan-Angus; Wang, Caixia; Ravens, Tom

Arctic coastal morphology is governed by multiple factors, many of which are affected by climatological changes. As the season length for shorefast ice decreases and temperatures warm permafrost soils, coastlines are more susceptible to erosion from storm waves. Such coastal erosion is a concern, since the majority of the population centers and infrastructure in the Arctic are located near the coasts. Stakeholders and decision makers increasingly need models capable of scenario-based predictions to assess and mitigate the effects of coastal morphology on infrastructure and land use. Our research uses Gaussian process models to forecast Arctic coastal erosion along the Beaufort Sea near Drew Point, AK. Gaussian process regression is a data-driven modeling methodology capable of extracting patterns and trends from data-sparse environments such as remote Arctic coastlines. To train our model, we use annual coastline positions and near-shore summer temperature averages from existing datasets and extend these data by extracting additional coastlines from satellite imagery. We combine our calibrated models with future climate models to generate a range of plausible future erosion scenarios. Our results show that the Gaussian process methodology substantially improves yearly predictions compared to linear and nonlinear least squares methods, and is capable of generating detailed forecasts suitable for use by decision makers.
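For readers unfamiliar with Gaussian process regression, the following is a minimal sketch of the posterior mean computation for a zero-mean GP with a squared-exponential kernel. It is not the authors' model: the kernel choice, the hyperparameters, the helper names, and the yearly values are all invented for illustration, and a real application would use a numerical library and fitted hyperparameters.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gp_posterior_mean(train_x, train_y, test_x, noise=1e-6, length_scale=1.0):
    """Posterior mean k_*^T (K + noise * I)^{-1} y of a zero-mean GP."""
    gram = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
             for j, b in enumerate(train_x)] for i, a in enumerate(train_x)]
    alpha = solve(gram, train_y)
    return [sum(rbf(t, a, length_scale) * w for a, w in zip(train_x, alpha))
            for t in test_x]


# Hypothetical yearly erosion values, invented purely for illustration.
years = [0.0, 1.0, 2.0, 3.0, 4.0]
erosion = [5.0, 7.0, 6.5, 9.0, 11.0]
fitted = gp_posterior_mean(years, erosion, years)
forecast = gp_posterior_mean(years, erosion, [4.5])
print(fitted, forecast)
```

With a near-zero noise term the posterior mean passes almost exactly through the training points; away from the data it decays toward the prior mean of zero, which is one reason practical models choose the mean function and kernel carefully.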

artificial intelligence, machine learning, modeling & simulation

1712.00867

Country: North America > United States > Alaska > Anchorage Municipality > Anchorage (0.14)

Genre: Research Report > New Finding (0.86)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.35)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

#artificialintelligence, Dec-1-2017, 20:40:46 GMT

Different types of Machine Learning Algorithms

With the emergence of programming tools like Python and the R language, machine learning has become the fastest-growing field in recent years. Research indicates that machine learning will replace around 25% of jobs worldwide in the next 10 years. With big data and data scientists, machine learning will only gain further momentum. Machine learning is based purely on algorithms, and we are going to explain the most important ones in detail below. All machine learning algorithms can be broadly classified into three main categories. The first is supervised learning, in which the input parameters and the output goals are predefined.

artificial intelligence, categorical response, machine learning algorithm

Genre: Research Report > New Finding (0.38)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.40)

Hughes, Michael C.; Hope, Gabriel; Weiner, Leah; McCoy, Thomas H.; Perlis, Roy H.; Sudderth, Erik B.; Doshi-Velez, Finale

Prediction-Constrained Topic Models for Antidepressant Recommendation

arXiv.org Machine Learning, Dec-1-2017

Supervisory signals can help topic models discover low-dimensional data representations that are more interpretable for clinical tasks. We propose a framework for training supervised latent Dirichlet allocation that balances two goals: faithful generative explanations of high-dimensional data and accurate prediction of associated class labels. Existing approaches fail to balance these goals by not properly handling a fundamental asymmetry: the intended task is always predicting labels from data, not data from labels. Our new prediction-constrained objective trains models that predict labels from held-out data well while also producing good generative likelihoods and interpretable topic-word parameters. In a case study on predicting depression medications from electronic health records, we demonstrate improved recommendations compared to previous supervised topic models and high-dimensional logistic regression from words alone.

machine learning, natural language, prediction

1712.00499

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

@machinelearnbot, Nov-30-2017, 18:43:07 GMT

Tree Boosting With XGBoost -- Why Does XGBoost Win "Every" Machine Learning Competition?

Tree boosting has empirically proven to be efficient for predictive mining for both classification and regression. For many years, MART (multiple additive regression trees) was the tree boosting method of choice. But starting in 2015, a first-to-try, always-winning algorithm surged to the surface: XGBoost. This algorithm re-implements tree boosting and gained popularity by winning Kaggle and other data science competitions. The paper first introduces the supervised learning task and discusses model selection techniques.
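To show the basic idea behind tree boosting, here is a minimal sketch of gradient boosting for squared loss using one-split regression stumps on a single feature. It is plain gradient boosting in the spirit of MART, not XGBoost itself, which adds regularization and other refinements. The function names, the learning rate, the number of rounds, and the toy data are assumptions made for the example.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump on 1-D inputs under squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    _, threshold, lmean, rmean = best
    return lambda x: lmean if x <= threshold else rmean


def boost(xs, ys, rounds=50, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    for _ in range(rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict


def mse(model, xs, ys):
    """Mean squared error of a callable model on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data invented for illustration: three rough plateaus.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 5.2, 5.0]
baseline = mse(lambda x: sum(ys) / len(ys), xs, ys)
boosted = mse(boost(xs, ys), xs, ys)
print(baseline, boosted)
```

Each round adds a small correction aimed at the remaining residuals, so the training error of the ensemble falls well below that of the constant baseline.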

artificial intelligence, machine learning, xgboost


Country: North America > United States > New York (0.14)

Genre: Contests & Prizes (0.34)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

#artificialintelligence, Nov-30-2017, 11:45:24 GMT

Implementing Machine Learning Using Python and Scikit-learn

For machine learning, you can also use these libraries to build learning models. However, doing so requires that you have a strong appreciation of the mathematical foundation for the various machine learning algorithms.

artificial intelligence, dataset, machine learning

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.33)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.33)