AITopics

2003.08259

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.74)

Industry: Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)

#artificialintelligenceMar-17-2020, 06:11:20 GMT

To deep, or not to deep, that is the question!

As in other fields of artificial intelligence and prior to the emergence of Deep Learning, especially deep neural networks, artificial vision research was focused on a traditional Machine Learning approach. The traditional machine learning approach relies on developers massaging the data to extract the most salient or significant aspects from the data they are dealing with; that is, time sequences of frames, or videos. In this case, both scientific research and application development have been centered around identifying the most significant image elements that would allow, for example, facial and body recognition of the people who appear in the images, tracking them from one frame to another, or classifying the vehicles that move through a given area. After extracting this meaningful data, statistical methods are then employed to transform the representation into a so-called "understanding" of the real visual environment by using clustering, support-vector machines (SVMs), and filtering algorithms (linear, non-linear, regression), among others. This means that the merits of any given application lie in how well researchers and developers are able to source and generate data from the raw processed frames and transform it into useful structured data.

characterization, developer

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.60)

Statistically Guided Divide-and-Conquer for Sparse Factorization of Large Matrix

Chen, Kun, Dong, Ruipeng, Xu, Wanwan, Zheng, Zemin

The sparse factorization of a large matrix is fundamental in modern statistical learning. In particular, the sparse singular value decomposition and its variants have been utilized in multivariate regression, factor analysis, biclustering, vector time series modeling, among others. The appeal of this factorization is owing to its power in discovering a highly-interpretable latent association network, either between samples and variables or between responses and predictors. However, many existing methods are either ad hoc without a general performance guarantee, or are computationally intensive, rendering them unsuitable for large-scale studies. We formulate the statistical problem as a sparse factor regression and tackle it with a divide-and-conquer approach. In the first stage of division, we consider both sequential and parallel approaches for simplifying the task into a set of co-sparse unit-rank estimation (CURE) problems, and establish the statistical underpinnings of these commonly-adopted and yet poorly understood deflation methods. In the second stage of division, we innovate a contended stagewise learning technique, consisting of a sequence of simple incremental updates, to efficiently trace out the whole solution paths of CURE. Our algorithm has a much lower computational complexity than alternating convex search, and the choice of the step size enables a flexible and principled tradeoff between statistical accuracy and computational efficiency. Our work is among the first to enable stagewise learning for non-convex problems, and the idea can be applicable in many multi-convex problems. Extensive simulation studies and an application in genetics demonstrate the effectiveness and scalability of our approach.

condition 2, matrix, stagewise, (16 more...)

2003.07898

Country:

North America > United States > Connecticut > Tolland County > Storrs (0.14)
North America > United States > New York (0.04)
North America > United States > Montana (0.04)
(4 more...)

Genre: Research Report > New Finding (0.45)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

Sah, Ramesh Kumar, Ghasemzadeh, Hassan

Adversarial Transferability in Wearable Sensor Systems

Machine learning has increasingly become the most used approach for inference and decision making in wearable sensor systems. However, recent studies have found that machine learning systems are easily fooled by the addition of adversarial perturbation to their inputs. What is more interesting is that the adversarial examples generated for one machine learning system can also degrade the performance of another. This property of adversarial examples is called transferability. In this work, we take the first strides in studying adversarial transferability in wearable sensor systems, from the following perspectives: 1) Transferability between machine learning models, 2) Transferability across subjects, 3) Transferability across sensor locations, and 4) Transferability across datasets. With Human Activity Recognition (HAR) as an example sensor system, we found strong untargeted transferability in all cases of transferability. Specifically, gradient-based attacks were able to achieve higher misclassification rates compared to non-gradient attacks. The misclassification rate of untargeted adversarial examples ranged from 20% to 98%. For targeted transferability between machine learning models, the success rate of adversarial examples was 100% for iterative attack methods. However, the success rate for other types of targeted transferability ranged from 20% to 0%. Our findings strongly suggest that adversarial transferability has serious consequences not only in sensor systems but also across the broad spectrum of ubiquitous computing.

adversarial example, adversarial transferability, transferability, (13 more...)

2003.07982

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Washington > Whitman County > Pullman (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (0.69)
Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Aoki, Raquel, Ester, Martin

ParKCa: Causal Inference with Partially Known Causes

Causal Inference methods based on observational data are an alternative for applications where collecting the counterfactual data or realizing a more standard experiment is not possible. In this work, our goal is to combine several observational causal inference methods to learn new causes in applications where some causes are well known. We validate the proposed method on The Cancer Genome Atlas (TCGA) dataset to identify genes that potentially cause metastasis.

application, causal model, dataset, (13 more...)

2003.07952

Country: North America > Canada (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Improving predictions by nonlinear regression models from outlying input data

Hsieh, William W.

When applying machine learning/statistical methods to the environmental sciences, nonlinear regression (NLR) models often perform only slightly better and occasionally worse than linear regression (LR). The proposed reason for this conundrum is that NLR models can give predictions much worse than LR when given input data which lie outside the domain used in model training. Continuous unbounded variables are widely used in environmental sciences, whence not uncommon for new input data to lie far outside the training domain. For six environmental datasets, inputs in the test data were classified as "outliers" and "non-outliers" based on the Mahalanobis distance from the training input data. The prediction scores (mean absolute error, Spearman correlation) showed NLR to outperform LR for the non-outliers, but often underperform LR for the outliers. An approach based on Occam's Razor (OR) was proposed, where linear extrapolation was used instead of nonlinear extrapolation for the outliers. The linear extrapolation to the outlier domain was based on the NLR model within the non-outlier domain. This NLR$_{\mathrm{OR}}$ approach reduced occurrences of very poor extrapolation by NLR, and it tended to outperform NLR and LR for the outliers. In conclusion, input test data should be screened for outliers. For outliers, the unreliable NLR predictions can be replaced by NLR$_{\mathrm{OR}}$ or LR predictions, or by issuing a "no reliable prediction" warning.

extrapolation, outlier, training data, (15 more...)

2003.07926

Country:

North America > Canada > Ontario (0.14)
Asia > China > Beijing > Beijing (0.05)
North America > United States > New York (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Transportation > Air (0.68)
Transportation > Infrastructure & Services > Airport (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligenceMar-16-2020, 01:45:16 GMT

XAI : The 3rd wave of AI- The Machine reveals WHY?

KernelExplainer:- Kernel SHAP is a method that uses a special weighted linear regression to compute the importance of each feature.

algorithm, lime, prediction, (16 more...)

#artificialintelligence

Country:

North America > United States > California (0.04)
Asia > Singapore (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.36)

Shehhi, Maryam R. Al, Kaya, Abdullah

Time series and machine learning to forecast the water quality from satellite data

arXiv.org Machine LearningMar-16-2020

Managing the quality of water for present and future generations of coastal regions should be a central concern of both citizens and public officials. Remote sensing can contribute to the management and monitoring of coastal water and pollutants. Algal blooms are a coastal pollutant that is a cause of concern. Many satellite data, such as MODIS, have been used to generate water-quality products to detect the blooms such as chlorophyll a (Chl-a), a photosynthesis index called fluorescence line height (FLH), and sea surface temperature (SST). It is important to characterize the spatial and temporal variations of these water quality products by using the mathematical models of these products. However, for monitoring, pollution control boards will need nowcasts and forecasts of any pollution. Therefore, we aim to predict the future values of the MODIS Chl-a, FLH, and SST of the water. This will not be limited to one type of water but, rather, will cover different types of water varying in depth and turbidity. This is very significant because the temporal trend of Chl-a, FLH, and SST is dependent on the geospatial and water properties. For this purpose, we will decompose the time series of each pixel into several components: trend, intra-annual variations, seasonal cycle, and stochastic stationary. We explore three such time series machine learning models that can characterize the non-stationary time series data and predict future values, including the Seasonal ARIMA (Auto Regressive Integrated Moving Average) (SARIMA), regression, and neural network. The results indicate that all these methods are effective at modelling Chl-a, FLH, and SST time series and predicting the values reasonably well. However, regression and neural network are found to be the best at predicting Chl-a in all types of water (turbid and shallow). Meanwhile, the SARIMA model provides the best prediction of FLH and SST.

chl-a, neural network, sarima, (13 more...)

2003.11923

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.24)
Indian Ocean > Arabian Gulf (0.15)
Asia > Middle East > Saudi Arabia > Arabian Gulf (0.15)
(16 more...)

Genre: Research Report (0.82)

Industry: Water & Waste Management > Water Management > Water Supplies & Services (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Malhotra, Shipra, Karanicolas, John

A Numerical Transform of Random Forest Regressors corrects Systematically-Biased Predictions

arXiv.org Machine LearningMar-16-2020

Over the past decade, random forest models have become widely used as a robust method for high-dimensional data regression tasks. In part, the popularity of these models arises from the fact that they require little hyperparameter tuning and are not very susceptible to overfitting. Random forest regression models are comprised of an ensemble of decision trees that independently predict the value of a (continuous) dependent variable; predictions from each of the trees are ultimately averaged to yield an overall predicted value from the forest. Using a suite of representative real-world datasets, we find a systematic bias in predictions from random forest models. We find that this bias is recapitulated in simple synthetic datasets, regardless of whether or not they include irreducible error (noise) in the data, but that models employing boosting do not exhibit this bias. Here we demonstrate the basis for this problem, and we use the training data to define a numerical transformation that fully corrects it. Application of this transformation yields improved predictions in every one of the real-world and synthetic datasets evaluated in our study.

dataset, prediction, random forest model, (13 more...)

2003.07445

Country:

North America > United States > Iowa > Story County > Ames (0.14)
North America > United States > Kansas > Douglas County > Lawrence (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Banking & Finance (0.93)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Government > Regional Government > North America Government > United States Government (0.46)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

#artificialintelligenceMar-15-2020, 22:04:32 GMT

Intuitions on L1 and L2 Regularisation

L1 and L2 regularisation owes its name to L1 and L2 norm of a vector w respectively. A linear regression model that implements L1 norm for regularisation is called lasso regression, and one that implements (squared) L2 norm for regularisation is called ridge regression. Note: Strictly speaking, the last equation (ridge regression) is a loss function with squared L2 norm of the weights (notice the absence of the square root). The regularisation terms are'constraints' by which an optimisation algorithm must'adhere to' when minimising the loss function, apart from having to minimise the error between the true y and the predicted ŷ. Let's define a model to see how L1 and L2 work.

equation, linear regression model, loss function, (11 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)