AITopics

2101.01918

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Haywood-Alexander, Marcus, Dervilis, Nikolaos, Worden, Keith, Cross, Elizabeth J., Mills, Robin S., Rogers, Timothy J.

Structured Machine Learning Tools for Modelling Characteristics of Guided Waves

The use of ultrasonic guided waves to probe materials and structures for damage continues to grow in popularity for non-destructive evaluation (NDE) and structural health monitoring (SHM). High-frequency waves such as these have an advantage over low-frequency methods in their ability to detect damage on a smaller scale. However, in order to assess damage in a structure, and implement any NDE or SHM tool, knowledge of the behaviour of a guided wave throughout the material/structure is important (especially when designing sensor placement for SHM systems). Determining this behaviour is extremely difficult in complex materials, such as fibre-matrix composites, where unique phenomena such as continuous mode conversion take place. This paper introduces a novel method for modelling the feature-space of guided waves in a composite material. The technique is based on a data-driven model in which prior physical knowledge is used to create structured machine learning tools, with constraints applied to provide that structure. The method makes use of Gaussian processes, a full Bayesian analysis tool, and the paper shows how physical knowledge of the guided waves can be utilised in modelling with an ML tool. It also shows that, through careful consideration when applying machine learning techniques, more robust models can be generated which offer advantages such as extrapolation ability and physical interpretation.

artificial intelligence, kernel, upstream oil & gas, (19 more...)

2101.01506

Country: Europe > United Kingdom (0.46)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.67)

David, Daniel Ben, Resheff, Yehezkel S., Tron, Talia

Explainable AI and Adoption of Algorithmic Advisors: an Experimental Study

arXiv.org Artificial Intelligence Jan-5-2021

Machine learning is becoming a commonplace part of our technological experience. The notion of explainable AI (XAI) is attractive when regulatory or usability considerations necessitate the ability to back decisions with a coherent explanation. A large body of research has addressed algorithmic methods of XAI, but it is still unclear how to determine which are best suited to foster human cooperation with, and adoption of, automatic systems. Here we develop an experimental methodology where participants play a web-based game, during which they receive advice from either a human or an algorithmic advisor, accompanied by explanations that vary in nature between experimental conditions. We use a reference-dependent decision-making framework and evaluate the game results over time, and in various key situations, to determine whether the different types of explanations affect the readiness to adopt, the willingness to pay for, and the trust in a financial AI consultant. We find that the types of explanations that promote adoption during a first encounter differ from those that are most successful following failure or when cost is involved. Furthermore, participants are willing to pay more for AI advice that includes explanations. These results add to the literature on the importance of XAI for algorithmic adoption and trust.

adoption, explanation, participant, (15 more...)

2101.02555

Country:

Europe (0.14)
North America > United States (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Raymaekers, Jakob, Verbeke, Wouter, Verdonck, Tim

Weight-of-evidence 2.0 with shrinkage and spline-binning

In many practical applications, such as fraud detection, credit risk modeling or medical decision making, classification models for assigning instances to a predefined set of classes are required to be both precise and interpretable. Linear modeling methods such as logistic regression are often adopted, since they offer an acceptable balance between precision and interpretability. Linear methods, however, are not well equipped to handle categorical predictors with high cardinality or to exploit non-linear relations in the data. As a solution, data preprocessing methods such as weight-of-evidence are typically used for transforming the predictors. The binning procedure that underlies the weight-of-evidence approach, however, has been little researched and typically relies on ad-hoc or expert-driven procedures. The objective of this paper, therefore, is to propose a formalized, data-driven and powerful method. To this end, we explore the discretization of continuous variables through the binning of spline functions, which allows for capturing non-linear effects in the predictor variables and yields highly interpretable predictors taking only a small number of discrete values. Moreover, we extend the weight-of-evidence approach and propose to estimate the proportions using shrinkage estimators. Together, this offers an improved ability to exploit both non-linear and categorical predictors for achieving increased classification precision, while maintaining interpretability of the resulting model and decreasing the risk of overfitting. We present the results of a series of experiments in a fraud detection setting, which illustrate the effectiveness of the presented approach. We facilitate reproduction of the presented results and adoption of the proposed approaches by providing both the dataset and the code for implementing the experiments and the presented approach.
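To make the weight-of-evidence transformation mentioned above concrete, here is a minimal sketch for a single categorical predictor. It is not the estimator proposed in the paper: the abstract does not specify the shrinkage estimator, so simple additive (pseudo-count) smoothing is used as a stand-in. The function name `woe_table`, the parameter `alpha`, the sign convention (log of the positive-class proportion over the negative-class proportion) and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def woe_table(categories, labels, alpha=0.5):
    """Weight-of-evidence per category with additive smoothing.

    ASSUMPTION: additive smoothing stands in for the paper's shrinkage
    estimator. alpha > 0 keeps the logarithm finite when a category has
    no positives or no negatives; as alpha grows, both smoothed
    proportions approach 1/k and every WoE value is pulled toward 0.
    """
    pos = Counter(c for c, y in zip(categories, labels) if y == 1)
    neg = Counter(c for c, y in zip(categories, labels) if y == 0)
    levels = sorted(set(categories))
    k = len(levels)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    table = {}
    for c in levels:
        # Smoothed share of all positives / negatives that fall in category c.
        p_pos = (pos[c] + alpha) / (n_pos + alpha * k)
        p_neg = (neg[c] + alpha) / (n_neg + alpha * k)
        table[c] = math.log(p_pos / p_neg)
    return table


# Hypothetical toy data: category "b" contains only positives.
cats = ["a", "a", "a", "b", "b", "c"]
ys = [1, 0, 0, 1, 1, 0]
mild = woe_table(cats, ys, alpha=0.5)
strong = woe_table(cats, ys, alpha=100.0)
print(mild)
print(strong)
```

Replacing each category by its value in the table gives a numeric predictor that a logistic regression can use directly, which is the role weight-of-evidence plays in the pipeline the abstract describes.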

categorical variable, category, woe value, (14 more...)

2101.01494

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.05)
Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.04)
Oceania > Australia (0.04)
(8 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.50)

Industry:

Health & Medicine (1.00)
Law Enforcement & Public Safety > Fraud (0.87)
Banking & Finance > Credit (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.51)

Zhang, Dongcheng, Zhang, Kunpeng

Weighting-Based Treatment Effect Estimation via Distribution Learning

Existing weighting methods for treatment effect estimation are often built upon the idea of propensity scores or covariate balance. They usually impose strong assumptions on the treatment assignment or outcome model to obtain unbiased estimates, such as linearity or specific functional forms, which easily lead to model mis-specification. In this paper, we aim to alleviate these issues by developing a distribution learning-based weighting method. We first learn the true underlying distribution of covariates conditioned on treatment assignment, then use the ratio of the covariate density in the treatment group to that in the control group as the weight for estimating treatment effects. Specifically, we propose to approximate the distribution of covariates in both treatment and control groups through invertible transformations via change of variables. To demonstrate the superiority, robustness, and generalizability of our method, we conduct extensive experiments using synthetic and real data. From the experimental results, we find that our method for estimating the average treatment effect on the treated (ATT) with observational data outperforms several cutting-edge weighting-only benchmarking methods, and it maintains its advantage under a doubly-robust estimation framework that combines weighting with some advanced outcome modeling methods.
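The density-ratio weighting idea described above can be sketched in a few lines. This is an illustration under strong simplifying assumptions, not the paper's method: the covariate is one-dimensional, and each group's density is approximated by a fitted Gaussian instead of the invertible transformations the abstract refers to. The function names, the synthetic data-generating process and the true effect of 1.0 are all hypothetical.

```python
import math
import random
import statistics


def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def att_density_ratio(x, t, y):
    """ATT estimate: treated mean minus density-ratio-weighted control mean.

    ASSUMPTION: each group's covariate density is a fitted 1-D Gaussian,
    a stand-in for a learned density model.
    """
    x1 = [xi for xi, ti in zip(x, t) if ti == 1]
    x0 = [xi for xi, ti in zip(x, t) if ti == 0]
    mu1, s1 = statistics.fmean(x1), statistics.stdev(x1)
    mu0, s0 = statistics.fmean(x0), statistics.stdev(x0)
    treated_mean = statistics.fmean([yi for yi, ti in zip(y, t) if ti == 1])
    num = den = 0.0
    for xi, ti, yi in zip(x, t, y):
        if ti == 0:
            # Weight = p(x | treated) / p(x | control): reweights controls
            # so their covariates resemble those of the treated group.
            w = gaussian_pdf(xi, mu1, s1) / gaussian_pdf(xi, mu0, s0)
            num += w * yi
            den += w
    return treated_mean - num / den


# Hypothetical synthetic data: the covariate shifts with treatment,
# the outcome depends on the covariate, and the true effect is 1.0.
random.seed(0)
n = 2000
t = [1] * n + [0] * n
x = [random.gauss(1.0, 1.0) for _ in range(n)] + [random.gauss(0.0, 1.0) for _ in range(n)]
y = [2.0 * xi + 1.0 * ti + random.gauss(0.0, 0.5) for xi, ti in zip(x, t)]

naive = (statistics.fmean([yi for yi, ti in zip(y, t) if ti == 1])
         - statistics.fmean([yi for yi, ti in zip(y, t) if ti == 0]))
weighted = att_density_ratio(x, t, y)
print("naive difference in means:", naive)
print("density-ratio weighted ATT:", weighted)
```

In this setup the naive difference in means is confounded by the covariate shift, while the weighted estimate should land much closer to the true effect. The quality of the result depends entirely on how well the density model fits, which is the part the paper addresses with learned distributions.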

covariate, epoch number, estimation, (15 more...)

2012.13805

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)

Lubold, Shane, Chandrasekhar, Arun G., McCormick, Tyler H.

Identifying the latent space geometry of network models through analysis of curvature

Statistically modeling networks, across numerous disciplines and contexts, is fundamentally challenging because of (often high-order) dependence between connections. A common approach assigns each person in the graph to a position on a low-dimensional manifold. Distance between individuals in this (latent) space is inversely proportional to the likelihood of forming a connection. The choice of the latent geometry (the manifold class, dimension, and curvature) has consequential impacts on the substantive conclusions of the model. More positive curvature in the manifold, for example, encourages more and tighter communities; negative curvature induces repulsion among nodes. Currently, however, the choice of the latent geometry is an a priori modeling assumption and there is limited guidance about how to make these choices in a data-driven way. In this work, we present a method to consistently estimate the manifold type, dimension, and curvature from an empirically relevant class of latent spaces: simply connected, complete Riemannian manifolds of constant curvature. Our core insight comes by representing the graph as a noisy distance matrix based on the ties between cliques. Leveraging results from statistical geometry, we develop hypothesis tests to determine whether the observed distances could plausibly be embedded isometrically in each of the candidate geometries. We explore the accuracy of our approach with simulations and then apply our approach to data-sets from economics and sociology as well as neuroscience.

clique, curvature, geometry, (15 more...)

2012.10559

Country:

North America > United States (0.14)
North America > Canada > Ontario (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications > Social Media (0.93)
Information Technology > Communications > Networks (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Toutiaee, Mohammadhossein, Miller, John

Gaussian Function On Response Surface Estimation

arXiv.org Machine Learning Jan-3-2021

We propose a new framework for two-dimensional interpretation (over features and samples) of black-box machine learning models via a metamodeling technique, by which we study the input-output relationships of the underlying machine learning model. The metamodel can be estimated from data generated via a trained complex model by running the computer experiment on samples of data in the region of interest. We utilize a Gaussian process as a surrogate to capture the response surface of a complex model, in which we incorporate two parts: interpolated values that are modeled by a stationary Gaussian process Z governed by a prior covariance function, and a mean function mu that captures the known trends in the underlying model. The variable importance parameter theta is optimized by maximizing the likelihood function. This theta corresponds to the correlation of individual variables with the target response. There is no need for any pre-assumed models since the approach depends only on empirical observations. Experiments demonstrate the potential of the interpretable model through quantitative assessment of the predicted samples.

g-forse, metamodel, prediction, (16 more...)

2101.00772

Country: North America > United States > Georgia > Clarke County > Athens (0.14)

Genre: Research Report > New Finding (0.31)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Villanueva, Nora M., Sestelo, Marta, Ordóñez, Celestino, Roca-Pardiñas, Javier

An automatic procedure to determine groups of nonparametric regression curves

arXiv.org Machine Learning Dec-30-2020

One of the main goals of statistical modelling is to understand the dependence of a response variable, Y, on an explanatory variable, X. This type of dependence can be studied through nonparametric regression models, where the relationship between Y and X is modelled without specifying in advance the function that links them. Within this framework, the study of the regression curves can be useful in the comparison of two or more groups, which is an important problem associated with statistical inference. In particular, the topic of testing the equality of mean functions has been widely investigated in the literature; see, for instance, the review by González-Manteiga and Crujeiras (2013). Relevant papers on this topic are Hall and Hart (1990); King et al. (1991); Delgado (1993); Kulasekera (1995); Young and Bowman (1995); Dette and Neumeyer (2001); Pardo-Fernández et al. (2007); Srihera and Stute (2010), among others. Furthermore, in order to compare the values of a response variable across several groups in the presence of a covariate effect, nonparametric analysis of covariance or a factor-by-curve interaction test can be used. Young and Bowman (1995) generalized the one-way analysis of variance test to the nonparametric regression setting, and Dette and Neumeyer (2001) proposed to use Young and Bowman's test also in the situation of a heteroscedastic error. In addition, Park and Kang (2008) developed a SiZer tool based on an analysis of variance type test statistic that is capable of comparing multiple curves based on the residuals. The evolution of this procedure is based on the comparison using the original regression curves (Park et al., 2014).

nonparametric regression curve, procedure, regression curve, (15 more...)

2012.15278

Country:

North America > United States > New York (0.04)
Europe > Spain > Asturias > Oviedo Province > Oviedo (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

arXiv.org Machine Learning Dec-29-2020

Inference for Low-rank Tensors -- No Need to Debias

Xia, Dong, Zhang, Anru R., Zhou, Yuchen

In this paper, we consider statistical inference for several low-rank tensor models. Specifically, in the Tucker low-rank tensor PCA or regression model, provided with any estimates achieving some attainable error rate, we develop data-driven confidence regions for the singular subspace of the parameter tensor based on the asymptotic distribution of an updated estimate obtained by two-iteration alternating minimization. The asymptotic distributions are established under some essential conditions on the signal-to-noise ratio (in the PCA model) or sample size (in the regression model). If the parameter tensor is further orthogonally decomposable, we develop the methods and theory for inference on each individual singular vector. For the rank-one tensor PCA model, we establish the asymptotic distribution for general linear forms of principal components and a confidence interval for each entry of the parameter tensor. Finally, numerical simulations are presented to corroborate our theoretical discoveries. In all these models, we observe that, unlike in many matrix/vector settings in existing work, debiasing is not required to establish the asymptotic distribution of estimates or to make statistical inference on low-rank tensors. In fact, due to the widely observed statistical-computational gap for low-rank tensor estimation, one usually requires stronger conditions than the statistical (or information-theoretic) limit to ensure that computationally feasible estimation is achievable. Surprisingly, such conditions "incidentally" render a feasible low-rank tensor inference without debiasing.

inequality, inference, probability, (14 more...)

2012.14844

Country:

Africa > Senegal > Kolda Region > Kolda (0.04)
Asia > China > Hong Kong (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(3 more...)

Genre: Research Report (0.81)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)

Dombry, Clément, Esstafa, Youssef

Behavior of linear L2-boosting algorithms in the vanishing learning rate asymptotic

arXiv.org Machine Learning Dec-29-2020

In the past decades, boosting has become a major and powerful prediction method in machine learning. The success of the classification algorithm AdaBoost by Freund and Schapire (1999) demonstrated the possibility of combining many weak learners in a sequential way in order to produce better predictions, with widespread applications in gene expression (Dudoit et al., 2002) or music genre identification (Bergstra et al., 2006), to name only a few. Friedman et al. (2000) were able to see a wider statistical framework that led to gradient boosting (Friedman, 2001), where a weak learner (e.g., regression trees) is used to optimize a loss function in a sequential procedure akin to gradient descent. Choosing the loss function according to the statistical problem at hand results in a versatile and efficient tool that can handle classification, regression, quantile regression or survival analysis. The popularity of gradient boosting is also due to its efficient implementation in the R package gbm by Ridgeway (2007). Alongside the methodological developments, strong theoretical results have justified the good performance of boosting. Consistency of boosting algorithms, i.e. their ability to achieve the optimal Bayes error rate for large samples, is considered in Breiman (2004), Zhang and Yu (2005) or Bartlett and Traskin (2007). The present paper is strongly influenced by Bühlmann and Yu (2003), who propose an analysis of regression boosting algorithms built on linear base learners thanks to explicit formulas for the boosted predictor and its error rate. In this paper, we focus on gradient boosting for regression with square loss and we briefly describe the corresponding algorithm.
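As a small illustration of gradient boosting for regression with square loss and a linear base learner, the sketch below repeatedly fits a least-squares line to the current residuals and adds a fraction `nu` of that fit to the predictor. It is a toy case chosen for this listing, not the algorithm or the results of the paper: the base learner is an orthogonal projection, the function names and data are hypothetical, and the closed form used for comparison holds only for projection-type base learners. For such a learner the predictor after m steps equals (1 - (1 - nu)^m) times the ordinary least-squares fit, and with m = t / nu the factor tends to 1 - exp(-t) as nu goes to 0.

```python
import math


def ols_fit(x, r):
    """Fitted values of the least-squares line through the points (x, r)."""
    n = len(x)
    mx = sum(x) / n
    mr = sum(r) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (ri - mr) for xi, ri in zip(x, r)) / sxx
    return [mr + slope * (xi - mx) for xi in x]


def l2_boost(x, y, nu, steps):
    """L2-boosting: fit the base learner to residuals, move a step of size nu."""
    f = [0.0] * len(y)
    for _ in range(steps):
        resid = [yi - fi for yi, fi in zip(y, f)]
        g = ols_fit(x, resid)
        f = [fi + nu * gi for fi, gi in zip(f, g)]
    return f


# Hypothetical toy data.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.9, 5.2, 6.8, 9.1]
nu, steps = 0.001, 1000  # corresponds to "time" t = nu * steps = 1

boosted = l2_boost(x, y, nu, steps)

# Closed form for a projection base learner: shrunken least-squares fit.
factor = 1.0 - (1.0 - nu) ** steps
closed_form = [factor * v for v in ols_fit(x, y)]

print("shrinkage factor after boosting:", factor)
print("small-learning-rate limit 1 - exp(-t):", 1.0 - math.exp(-1.0))
```

The comparison shows why the learning rate and the number of iterations are naturally studied together: in this toy case only their product matters once the learning rate is small.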

algorithm, equation, proposition 2, (15 more...)

2012.14657

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
Europe > France > Bourgogne-Franche-Comté > Doubs > Besançon (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)