AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.77)

#artificialintelligence Aug-3-2018, 18:43:04 GMT

Machine Learning Results in R: one plot to rule them all! (Part 2 – Regression Models)

Given the number of people interested in my first post on visualizing classification model results, I've decided to create and share some new functions to visualize and compare whole linear regression models with one line of code. These plots should reduce the time we invest in model selection and give us a general understanding of our results. Where are we going with this post? Let's take a quick look at the final output: a neat dashboard with everything you'd need to evaluate whether your regression model is looking good, compare it with others, or get working on further improvements. Interestingly, the exact same function, mplot_full, used in the Part 1 – Classification Models post, will work on regressions too (lares::updateLares()).

artificial intelligence, machine learning, regression, (7 more...)

Country: North America > United States > Pennsylvania (0.05)

Genre: Research Report (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

arXiv.org Machine Learning Aug-2-2018

The impact of imbalanced training data on machine learning for author name disambiguation

Kim, Jinseok, Kim, Jenna

In supervised machine learning for author name disambiguation, negative training data are often far more numerous than positive training data. This paper examines how the ratio of negative to positive training data affects the performance of machine learning algorithms that disambiguate author names in bibliographic records. On multiple labeled datasets, three classifiers (Logistic Regression, Naïve Bayes, and Random Forest) are trained on representative features, such as coauthor names and title words, extracted from the same training data but with various positive-negative training data ratios. Results show that increasing negative training data can improve disambiguation performance, but only by a few percent, and can sometimes degrade it. Logistic Regression and Naïve Bayes learn optimal disambiguation models even with a base ratio (1:1) of positive and negative training data. Also, the performance improvement by Random Forest tends to saturate quickly, roughly after 1:10 ~ 1:15. These findings imply that, contrary to the common practice of using all training data, name disambiguation algorithms can be trained on part of the negative training data without degrading disambiguation performance much, while increasing computational efficiency. This study calls for more attention from author name disambiguation scholars to methods for machine learning from imbalanced data.
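The idea of training on only part of the negative data can be illustrated with a minimal sketch. This is not the authors' code: the function name `subsample_negatives`, the toy record pairs, and the 1:10 ratio used below are assumptions chosen for illustration only.

```python
import random


def subsample_negatives(pairs, labels, neg_per_pos, seed=0):
    """Keep every positive pair and at most neg_per_pos negatives per positive.

    Hypothetical helper for illustration; labels are 1 (same author)
    or 0 (different authors).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Never ask for more negatives than are available.
    n_keep = min(len(neg), neg_per_pos * len(pos))
    kept_neg = rng.sample(neg, n_keep)
    keep = sorted(pos + kept_neg)
    return [pairs[i] for i in keep], [labels[i] for i in keep]


# Toy data (made up): 5 positive and 95 negative record pairs.
labels = [1] * 5 + [0] * 95
pairs = [("record_a_%d" % i, "record_b_%d" % i) for i in range(100)]

# Build a 1:10 positive-to-negative training set.
sub_pairs, sub_labels = subsample_negatives(pairs, labels, neg_per_pos=10)
print(sum(sub_labels), len(sub_labels) - sum(sub_labels))  # 5 50
```

A classifier would then be trained on `sub_pairs` and `sub_labels` instead of the full set, and the ratio varied to see where performance stops improving.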

artificial intelligence, machine learning, training data, (13 more...)

doi: 10.1007/s11192-018-2865-9

1808.00525

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > New York > Onondaga County > Syracuse (0.04)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

#artificialintelligence Aug-1-2018, 14:49:00 GMT

An Introduction to Applied Machine Learning with Multiple Linear Regression and Python

The purpose of this post is to unpack for the layman the basic concepts of applied machine learning and to document how data scientists or data analysts would generally answer a question or solve a problem with data and machine learning algorithms. Hopefully, by the end, you will have a more solid understanding of the steps your data scientists or business intelligence officers should be going through when attempting to apply the power of machine learning to data. Machine learning is a method of data analysis that automates analytical model building. The steps illustrated here are written as a 'practical guide' to that method. It covers the broad strokes of the process one would go through when implementing other, similar machine learning algorithms or ideas.

artificial intelligence, linear regression, machine learning, (11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.57)

#artificialintelligence Aug-1-2018, 08:00:14 GMT

Coding Deep Learning for Beginners -- Linear Regression (Part 1): Initialization and Prediction

This is the 3rd article of the series "Coding Deep Learning for Beginners". Links to all articles, the agenda, and general information about the estimated release dates of upcoming articles can be found at the bottom of the 1st article. They are also available in my open source portfolio -- MyRoadToAI, along with some mini-projects, presentations, tutorials and links. You can also read the article on my personal website, hosted with Jekyll in order to improve readability (supporting code syntax highlighting, LaTeX equations and more). Some of you may wonder why an article series about explaining and coding Neural Networks starts with a basic Machine Learning algorithm such as Linear Regression.

artificial intelligence, linear regression, machine learning, (12 more...)

Country: Europe > Poland > Lesser Poland Province > Kraków (0.07)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

#artificialintelligence Jul-31-2018, 13:56:19 GMT

Different methods of feature selection

In our previous post, we discussed what feature selection is and why we need it. In this post, we're going to look at the different methods used in feature selection. There are three main classes of feature selection methods – Filter Methods, Wrapper Methods, and Embedded Methods. We'll look at each of them individually. Filter methods are learning-algorithm-agnostic, which means they can be employed no matter which learning algorithm you're using.
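A filter method can be sketched in a few lines. The example below is only one possible instance: it assumes absolute Pearson correlation with the target as the scoring statistic, which the post excerpt does not specify, and the names `pearson`, `filter_select` and the toy columns are hypothetical. Note that no learning algorithm appears anywhere in the scoring step, which is what makes the approach algorithm-agnostic.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def filter_select(features, target, k):
    """Score each feature independently of any model and keep the top k."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Toy data (made up): one informative, one noisy, one constant feature.
features = {
    "x_linear": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x_noisy": [2.0, 1.0, 4.0, 3.0, 5.0],
    "x_constant": [7.0] * 5,
}
target = [2.0, 4.0, 6.0, 8.0, 10.0]

selected, scores = filter_select(features, target, k=2)
print(selected)  # ['x_linear', 'x_noisy']
```

The selected columns could then be passed to any downstream model; other scoring statistics (for example a chi-squared or mutual information score) would slot into the same structure.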

artificial intelligence, feature selection, machine learning, (10 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.33)

#artificialintelligence Jul-31-2018, 13:55:18 GMT

Using deep learning to predict emergency room visits

At IBM Research, we are exploring new solutions for a range of health care challenges. One such challenge is emergency room (ER) overcrowding, which can lead to long wait times for treatment. Patients who use the ER for non-emergency situations are more likely to return to the ER multiple times (Poole et al. 2016), further contributing to overcrowding. Identifying those patients who are likely to return to the ER may enable hospitals to intervene to ensure access to necessary care outside the ER and potentially alleviate overcrowding. My team at IBM Research-China took on this challenge.

artificial intelligence, machine learning, predict emergency room visit, (12 more...)

Country:

Asia > China (0.29)
Europe > Sweden (0.06)

Industry: Health & Medicine > Health Care Providers & Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Kia, Seyed Mostafa, Beckmann, Christian F., Marquand, Andre F.

Scalable Multi-Task Gaussian Process Tensor Regression for Normative Modeling of Structured Variation in Neuroimaging Data

arXiv.org Machine Learning Jul-31-2018

Most brain disorders are very heterogeneous in terms of their underlying biology and developing analysis methods to model such heterogeneity is a major challenge. A promising approach is to use probabilistic regression methods to estimate normative models of brain function using (f)MRI data then use these to map variation across individuals in clinical populations (e.g., via anomaly detection). To fully capture individual differences, it is crucial to statistically model the patterns of correlation across different brain regions and individuals. However, this is very challenging for neuroimaging data because of high-dimensionality and highly structured patterns of correlation across multiple axes. Here, we propose a general and flexible multi-task learning framework to address this problem. Our model uses a tensor-variate Gaussian process in a Bayesian mixed-effects model and makes use of Kronecker algebra and a low-rank approximation to scale efficiently to multi-way neuroimaging data at the whole brain level. On a publicly available clinical fMRI dataset, we show that our computationally affordable approach substantially improves detection sensitivity over both a mass-univariate normative model and a classifier that --unlike our approach-- has full access to the clinical labels.

artificial intelligence, data mining, machine learning, (12 more...)

1808.00036

Country: Europe (0.67)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Friedberg, Rina, Tibshirani, Julie, Athey, Susan, Wager, Stefan

Local Linear Forests

arXiv.org Machine Learning Jul-30-2018

Random forests are a powerful method for non-parametric regression, but are limited in their ability to fit smooth signals, and can show poor predictive performance in the presence of strong, smooth effects. Taking the perspective of random forests as an adaptive kernel method, we pair the forest kernel with a local linear regression adjustment to better capture smoothness. The resulting procedure, local linear forests, enables us to improve on asymptotic rates of convergence for random forests with smooth signals, and provides substantial gains in accuracy on both real and simulated data.
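The local linear adjustment can be illustrated with a small one-dimensional sketch. It is not the authors' procedure: a Gaussian kernel stands in for the forest-derived weights, and the function names, the bandwidth, and the optional `ridge` penalty on the slope are assumptions made for illustration. The point is only to show why fitting a weighted line, rather than taking a weighted average, helps on smooth signals.

```python
import math


def gaussian_weights(x0, xs, bandwidth=1.0):
    """Stand-in for forest weights (assumption): Gaussian kernel centred at x0."""
    return [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]


def weighted_average(ys, weights):
    """Local constant prediction: a plain kernel-weighted mean."""
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def local_linear_predict(x0, xs, ys, weights, ridge=0.0):
    """Weighted least squares of y on (x - x0); the intercept is the prediction at x0.

    The ridge term penalises only the slope, not the intercept.
    """
    sw = sum(weights)
    s1 = sum(w * (x - x0) for w, x in zip(weights, xs))
    s2 = sum(w * (x - x0) ** 2 for w, x in zip(weights, xs)) + ridge
    t0 = sum(w * y for w, y in zip(weights, ys))
    t1 = sum(w * (x - x0) * y for w, x, y in zip(weights, xs, ys))
    det = sw * s2 - s1 * s1
    if det == 0:
        return t0 / sw  # fall back to the weighted mean
    # Intercept from the 2x2 normal equations.
    return (s2 * t0 - s1 * t1) / det


# Toy smooth signal (made up): y = 2x + 1, queried at the right boundary.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 1.0 for x in xs]
x0 = 5.0
w = gaussian_weights(x0, xs)

avg_pred = weighted_average(ys, w)
lin_pred = local_linear_predict(x0, xs, ys, w)
print(avg_pred, lin_pred)
```

On this toy signal the weighted average is pulled below the true value of 11 at the boundary, because all neighbours lie on one side of the query point, while the local linear fit recovers the line.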

artificial intelligence, machine learning, random forest, (15 more...)

1807.11408

Country:

Europe > Austria > Vienna (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.94)

Yan, Hao, Paynabar, Kamran, Pacella, Massimo

Structured Point Cloud Data Analysis via Regularized Tensor Regression for Process Modeling and Optimization

arXiv.org Machine Learning Jul-30-2018

Modern measurement technologies provide the means to measure high density spatial and geometric data in three-dimensional (3D) coordinate systems, referred to as point clouds. Point cloud data analysis has broad applications in advanced manufacturing and metrology for measuring dimensional accuracy and shape analysis, in geographic information systems (GIS) for digital elevation modeling and analysis of terrains, in computer graphics for shape reconstruction, and in medical imaging for volumetric measurement to name a few. The role of point cloud data in manufacturing is now more important than ever, particularly in the field of smart and additive manufacturing processes, where products with complex shape and geometry are manufactured with the help of advanced technologies (Gibson et al., 2010). In these processes, the dimensional and geometric accuracy of manufactured parts are measured in the form of point clouds using modern sensing devices, including touch-probe coordinate measuring machines (CMM) and optical systems, such as laser scanners. Modeling the relationship of the dimensional accuracy, encapsulated in point clouds, with process parameters and machine settings is vital for variation reduction and process optimization.

artificial intelligence, cloud computing, machine learning, (17 more...)

1807.10278

Country:

Africa > Senegal > Kolda Region > Kolda (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > Arizona > Maricopa County > Tempe (0.04)
Europe > Italy (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Health Care Technology (0.48)
Health & Medicine > Diagnostic Medicine > Imaging (0.48)

Technology:

Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)