AITopics

1605.02234

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

#artificialintelligence Oct-16-2016, 18:30:27 GMT

Machine Learning Done Wrong

In engineering, there are various ways to build a key-value store, and each design makes a different set of assumptions about the usage pattern. In statistical modeling, there are various algorithms to build a classifier, and each algorithm makes a different set of assumptions about the data. When dealing with small amounts of data, it's reasonable to try as many algorithms as possible and to pick the best one since the cost of experimentation is low. But as we hit "big data", it pays off to analyze the data upfront and then design the modeling pipeline (pre-processing, modeling, optimization algorithm, evaluation, productionization) accordingly. As pointed out in my previous post, there are dozens of ways to solve a given modeling problem. Each model assumes something different, and it's not obvious how to navigate and identify which assumptions are reasonable.

artificial intelligence, coefficient, machine learning, (12 more...)

Industry: Law Enforcement & Public Safety > Fraud (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.33)

#artificialintelligence Oct-16-2016, 15:01:54 GMT

From both sides now: the math of linear regression ·

Linear regression is the most basic and the most widely used technique in machine learning; yet for all its simplicity, studying it can unlock some of the most important concepts in statistics. If you have a basic understanding of linear regression expressed as \hat{Y} = \theta_0 + \theta_1 X, but don't have a background in statistics and find statements like "ridge regression is equivalent to the maximum a posteriori (MAP) estimate with a zero-mean Gaussian prior" bewildering, then this post is for you. With a superficial goal of understanding that somewhat obtuse statement, the post's main objective is to explore the topic, starting from the standard formulation of linear regression, moving on to the probabilistic approach (maximum likelihood formulation) and from there to Bayesian linear regression. I'll use the \theta character throughout to refer to the coefficients (weights) of a regression model, either explicitly broken out as \theta_0 and \theta_1 for intercept and slope respectively, or just \theta referring to the vector of coefficients. I'll usually use the expression \theta^T x_i for the prediction a model gives at x_i, the assumption being that a 1 has been added to the vector of values at x_i. In the single predictor case, we know that the least squares fit is the line that minimizes the sum of the squared distances between observed data and predicted values, i.e. it minimizes the Residual Sum of Squares (RSS): RSS(\theta) = \sum_i (y_i - \theta^T x_i)^2. These residuals are pretty important in how we reason about our model.
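The least squares fit and its ridge counterpart can be sketched for the single predictor case in a few lines of plain Python. This is a minimal illustration, not code from the post: the function names `fit_line` and `rss`, the toy data, and the choice to penalize only the slope are assumptions made here. Under the standard correspondence, a zero-mean Gaussian prior with variance \tau^2 on the slope and Gaussian noise with variance \sigma^2 gives a MAP estimate equal to the ridge fit with penalty \lambda = \sigma^2 / \tau^2.

```python
def fit_line(xs, ys, lam=0.0):
    """Fit y ~ theta_0 + theta_1 * x by minimizing RSS + lam * theta_1**2.

    lam = 0 gives ordinary least squares. lam > 0 gives a ridge fit in which
    only the slope is penalized (an assumption of this sketch).
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    theta_1 = s_xy / (s_xx + lam)      # the penalty shrinks the slope toward 0
    theta_0 = y_bar - theta_1 * x_bar  # unpenalized intercept
    return theta_0, theta_1


def rss(theta, xs, ys):
    """Residual Sum of Squares of the line theta = (theta_0, theta_1)."""
    theta_0, theta_1 = theta
    return sum((y - (theta_0 + theta_1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data, roughly y = 1 + 2x plus noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

ols = fit_line(xs, ys)               # least squares fit
ridge = fit_line(xs, ys, lam=5.0)    # ridge fit, lam plays the role of sigma^2 / tau^2

print("OLS   theta:", ols, "RSS:", rss(ols, xs, ys))
print("ridge theta:", ridge, "RSS:", rss(ridge, xs, ys))
```

The ridge slope is smaller in magnitude than the least squares slope, and its RSS is larger, which is the price paid for pulling the coefficient toward the prior mean of zero.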

#artificialintelligence Oct-16-2016, 13:15:41 GMT

sparklyr -- R interface for Apache Spark

H2O Sparkling Water supports a wide array of algorithms, and as illustrated above it's easy to chain these functions together with dplyr pipelines. To learn more see the H2O Sparkling Water section.

artificial intelligence, machine learning, partition, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

#artificialintelligence Oct-16-2016, 02:31:35 GMT

Gentlest Intro to TensorFlow #3: Matrices & Multi-feature Linear Regression – All of us are belong to machines

Summary: With concepts of single-feature linear-regression, cost function, gradient descent (from Part 1), epoch, learn-rate, gradient descent variation (from Part 2) under our belt, we are ready to progress to multi-feature linear regression with TensorFlow (TF). If you are already familiar with matrices and multi-feature linear regression, skip to the end for the multi-feature TensorFlow code cheatsheet, or even skip this entire article. The premise of the previous articles was: given any house size (square meters/sqm), which is the feature, we want to predict the house price, the outcome. In reality, any prediction relies on multiple features, so we advance from single-feature to 2-feature linear regression; we chose 2 features to keep visualization and comprehension simple, but the concept generalizes to any number of features. We introduce a new feature, 'Rooms' (number of units in the house).
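The step from one feature to two can be sketched without TensorFlow, using plain Python and batch gradient descent on the mean squared error. This is an illustrative sketch rather than the article's cheatsheet: the names `predict` and `train`, the learn-rate, the epoch count, and the toy data (house size in units of 100 sqm, number of rooms, and prices generated from price = 2 * size + 3 * rooms + 1) are all assumptions made here.

```python
def predict(features, weights, bias):
    """Multi-feature linear model: one weight per feature plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train(rows, targets, learn_rate=0.05, epochs=5000):
    """Batch gradient descent on the mean squared error cost."""
    n = len(rows)
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        # Prediction errors for the whole batch, computed before any update.
        errors = [predict(r, weights, bias) - t for r, t in zip(rows, targets)]
        for j in range(n_features):
            grad_j = 2.0 / n * sum(e * r[j] for e, r in zip(errors, rows))
            weights[j] -= learn_rate * grad_j
        bias -= learn_rate * 2.0 / n * sum(errors)
    return weights, bias


# Hypothetical data: (size in 100 sqm, rooms) -> price in arbitrary units.
rows = [(1.0, 1.0), (2.0, 1.0), (3.0, 2.0), (4.0, 3.0)]
targets = [6.0, 8.0, 13.0, 18.0]

weights, bias = train(rows, targets)
print("weights:", weights, "bias:", bias)
print("prediction for (2.5, 2):", predict((2.5, 2.0), weights, bias))
```

The features are kept on a small scale on purpose. With raw square meters the same learn-rate would likely diverge, which is one reason feature scaling usually precedes gradient descent.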

artificial intelligence, linear regression, machine learning, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligence Oct-16-2016, 00:46:11 GMT

Interpreting the results of linear regression – EFavDB

The full code is available as an IPython notebook on GitHub. Assuming a multivariate normal distribution for the residuals in linear regression allows us to construct test statistics and therefore specify uncertainty in our fits. A t-test judges the explanatory power of a predictor in isolation, although the standard error that appears in the calculation of the t-statistic is a function of the other predictors in the model. On the other hand, an F-test is a global test that judges the explanatory power of all the predictors together, and we've seen that parsimony in choosing predictors can improve the quality of the overall regression. We've also seen that multicollinearity can throw off the results of individual t-tests as well as obscure the interpretation of the signs of the fitted coefficients. A symptom of multicollinearity is that none of the individual coefficients is significant while the overall F-test is.
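The two statistics can be illustrated in the simplest setting, a regression with a single predictor, where the global F-statistic equals the square of the slope's t-statistic. The sketch below is not taken from the notebook mentioned above; the function name `slope_t_and_f` and the toy data are assumptions. The multicollinearity symptom described in the text only appears with several correlated predictors, which this one-predictor sketch does not cover.

```python
import math


def slope_t_and_f(xs, ys):
    """Return (t-statistic of the slope, overall F-statistic) for y ~ a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar

    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    tss = sum((y - y_bar) ** 2 for y in ys)

    sigma2 = rss / (n - 2)               # residual variance estimate, n - 2 d.o.f.
    se_slope = math.sqrt(sigma2 / s_xx)  # standard error of the slope
    t_stat = slope / se_slope

    # F-test: explained variation (1 d.o.f.) over residual variation (n - 2 d.o.f.).
    f_stat = (tss - rss) / sigma2
    return t_stat, f_stat


# Hypothetical toy data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

t_stat, f_stat = slope_t_and_f(xs, ys)
print("t:", t_stat, "F:", f_stat, "t squared:", t_stat ** 2)
```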

artificial intelligence, eqnarray, machine learning, (19 more...)

Genre: Research Report > Experimental Study (0.73)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.73)

#artificialintelligence Oct-14-2016, 00:20:59 GMT

How to Scale Machine Learning Data From Scratch With Python - Machine Learning Mastery

Many machine learning algorithms expect data to be scaled consistently. There are two popular methods that you should consider when scaling your data for machine learning. In this tutorial, you will discover how you can rescale your data for machine learning. Many machine learning algorithms expect the scale of the input and even the output data to be equivalent. It can help in methods that weight inputs in order to make a prediction, such as in linear regression and logistic regression.
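The excerpt does not name the two methods. Two widely used rescaling approaches are min-max normalization and standardization, and the sketch below assumes those are the ones meant. The function names, the use of the sample standard deviation, and the toy column are choices made here, not taken from the tutorial.

```python
import math


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]  # constant column: nothing to spread out
    return [(v - lo) / span for v in values]


def standardize(values):
    """Center values on mean 0 and scale to unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    std = math.sqrt(variance)
    if std == 0:
        return [0.0 for _ in values]  # constant column
    return [(v - mean) / std for v in values]


# Hypothetical column of raw feature values.
column = [50.0, 30.0, 70.0]
print("normalized:  ", min_max_scale(column))
print("standardized:", standardize(column))
```

In practice the minimum, maximum, mean and standard deviation would be estimated on the training data only and then reused to transform new data.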

artificial intelligence, dataset, machine learning, (14 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)

Fayad, Ibrahim, Baghdadi, Nicolas, Guitet, Stéphane, Bailly, Jean-Stéphane, Hérault, Bruno, Gond, Valéry, Hajj, Mahmoud, Minh, Dinh Ho Tong

Aboveground biomass mapping in French Guiana by combining remote sensing, forest inventories and environmental data

arXiv.org Machine Learning Oct-14-2016

Mapping forest aboveground biomass (AGB) has become an important task, particularly for the reporting of carbon stocks and changes. AGB can be mapped using synthetic aperture radar data (SAR) or passive optical data. However, these data are insensitive to high AGB levels (>150 Mg/ha, and >300 Mg/ha for P-band), which are commonly found in tropical forests. Studies have mapped the rough variations in AGB by combining optical and environmental data at regional and global scales. Nevertheless, these maps cannot represent local variations in AGB in tropical forests. In this paper, we hypothesize that the problem of misrepresenting local variations in AGB and AGB estimation with good precision occurs because of both methodological limits (signal saturation or dilution bias) and a lack of adequate calibration data in this range of AGB values. We test this hypothesis by developing a calibrated regression model to predict variations in high AGB values (mean >300 Mg/ha) in French Guiana by a methodological approach for spatial extrapolation with data from the optical geoscience laser altimeter system (GLAS), forest inventories, radar, optics, and environmental variables for spatial inter- and extrapolation. Given their higher point count, GLAS data allow a wider coverage of AGB values. We find that the metrics from GLAS footprints are correlated with field AGB estimations ($R^2$ = 0.54, RMSE = 48.3 Mg/ha) with no bias for high values. First, predictive models, including remote-sensing, environmental variables and spatial correlation functions, allow us to obtain "wall-to-wall" AGB maps over French Guiana with an RMSE for the in situ AGB estimates of ~51 Mg/ha and $R^2$ = 0.48 at a 1-km grid size. We conclude that a calibrated regression model based on GLAS with dependent environmental data can produce good AGB predictions even for high AGB values if the calibration data fit the AGB range.
We also demonstrate that small temporal and spatial mismatches between field data and GLAS footprints are not a problem for regional and global calibrated regression models because field data aim to predict large and deep tendencies in AGB variations from environmental gradients and do not aim to represent high but stochastic and temporally limited variations from forest dynamics. Thus, we advocate including a greater variety of data, even if less precise and shifted, to better represent high AGB values in global models and to improve the fitting of these models for high values.
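The two fit metrics quoted in the abstract, RMSE and $R^2$, can be computed as in the sketch below. It uses the common definitions (root of the mean squared residual, and one minus the ratio of residual to total sum of squares); whether the paper computes $R^2$ exactly this way is an assumption, and the values are hypothetical toy numbers, not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical field AGB values and model predictions, in Mg/ha.
field_agb = [300.0, 350.0, 400.0]
model_agb = [310.0, 340.0, 400.0]

print("RMSE (Mg/ha):", rmse(field_agb, model_agb))
print("R^2:", r_squared(field_agb, model_agb))
```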

agb estimate, artificial intelligence, upstream oil & gas, (17 more...)

doi: 10.1016/j.jag.2016.07.015

1610.04371

Country:

South America > French Guiana (0.93)
North America > United States (0.67)
Africa (0.28)
(2 more...)

Genre: Research Report (1.00)

Industry:

Energy > Oil & Gas > Upstream (0.93)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.73)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Sourati, Jamshid, Akcakaya, Murat, Leen, Todd K., Erdogmus, Deniz, Dy, Jennifer G.

Asymptotic Analysis of Objectives based on Fisher Information in Active Learning

arXiv.org Machine Learning Oct-14-2016

Obtaining labels can be costly and time-consuming. Active learning allows a learning algorithm to intelligently query samples to be labeled for efficient learning. Fisher information ratio (FIR) has been used as an objective for selecting queries in active learning. However, little is known about the theory behind the use of FIR for active learning. There is a gap between the underlying theory and the motivation of its usage in practice. In this paper, we attempt to fill this gap and provide a rigorous framework for analyzing existing FIR-based active learning methods. In particular, we show that FIR can be asymptotically viewed as an upper bound of the expected variance of the log-likelihood ratio. Additionally, our analysis suggests a unifying framework that not only enables us to make theoretical comparisons among the existing querying methods based on FIR, but also allows us to give insight into the development of new active learning approaches based on this objective.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1605.08798

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
(2 more...)

Zhu, Yinchu, Bradic, Jelena

Two-sample testing in non-sparse high-dimensional linear models

arXiv.org Machine Learning Oct-14-2016

In analyzing high-dimensional models, sparsity of the model parameter is a common but often undesirable assumption. In this paper, we study the following two-sample testing problem: given two samples generated by two high-dimensional linear models, we aim to test whether the regression coefficients of the two linear models are identical. We propose a framework named TIERS (short for TestIng Equality of Regression Slopes), which solves the two-sample testing problem without making any assumptions on the sparsity of the regression parameters. TIERS builds a new model by convolving the two samples in such a way that the original hypothesis translates into a new moment condition. A self-normalization construction is then developed to form a moment test. We provide rigorous theory for the developed framework. Under very weak conditions of the feature covariance, we show that the accuracy of the proposed test in controlling Type I errors is robust both to the lack of sparsity in the features and to the heavy tails in the error distribution, even when the sample size is much smaller than the feature dimension. Moreover, we discuss minimax optimality and efficiency properties of the proposed test. Simulation analysis demonstrates excellent finite-sample performance of our test. In deriving the test, we also develop tools that are of independent interest. The test is built upon a novel estimator, called Auto-aDaptive Dantzig Selector (ADDS), which not only automatically chooses an appropriate scale of the error term but also incorporates prior information. To effectively approximate the critical value of the test statistic, we develop a novel high-dimensional plug-in approach that complements the recent advances in Gaussian approximation theory.

artificial intelligence, bradic high-dimensional two-sample testing, machine learning, (15 more...)

1610.0458

Country: North America > United States > California (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)