AITopics

2106.09473

Country:

North America > United States > California > San Francisco County > San Francisco (0.13)
North America > United States > California > Alameda County > Berkeley (0.13)
Europe > Spain > Canary Islands (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Charlton, Colleen E., Poon, Michael Tin Chung, Brennan, Paul M., Fleuriot, Jacques D.

Interpretable Machine Learning Classifiers for Brain Tumour Survival Prediction

arXiv.org Artificial Intelligence, Jun-17-2021

Prediction of survival in patients diagnosed with a brain tumour is challenging because of heterogeneous tumour behaviours and responses to treatment. Better estimations of prognosis would support treatment planning and patient support. Advances in machine learning have informed development of clinical predictive models, but their integration into clinical practice is almost non-existent. One reason for this is the lack of interpretability of models. In this paper, we use a novel brain tumour dataset to compare two interpretable rule list models against popular machine learning approaches for brain tumour survival prediction. All models are quantitatively evaluated using standard performance metrics. The rule lists are also qualitatively assessed for their interpretability and clinical utility. The interpretability of the black box machine learning models is evaluated using two post-hoc explanation techniques, LIME and SHAP. Our results show that the rule lists were only slightly outperformed by the black box models. We demonstrate that rule list algorithms produced simple decision lists that align with clinical expertise. By comparison, post-hoc interpretability methods applied to black box models may produce unreliable explanations of local model predictions. Model interpretability is essential for understanding differences in predictive performance and for integration into clinical practice.

dataset, prediction, survival, (16 more...)


2106.09424

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

#artificialintelligence, Jun-16-2021, 21:50:04 GMT

Out-and-Out in Artificial Neural Networks with Keras

When I started reading articles on neural networks, I struggled to understand the basics of how they work. I kept reading articles on the internet, collected the key points, and put them together into private notes. Then I decided to publish them so that others could understand the topic better. It is fun to learn the basics of any domain. The perceptron is one of the simplest ANN architectures, invented in 1957 by Frank Rosenblatt.
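To make the perceptron concrete, here is a minimal sketch of the classic perceptron learning rule in plain Python. It is not the article's Keras code; the function names, the AND-gate data, and the default hyperparameters are assumptions chosen for illustration.

```python
def train_perceptron(samples, labels, epochs=20, learning_rate=1.0):
    """Train a single perceptron with a step activation (illustrative sketch)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, target in zip(samples, labels):
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            predicted = 1 if activation > 0 else 0
            error = target - predicted  # -1, 0, or +1
            # Move the decision boundary only when the prediction is wrong.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, features):
    """Return 1 if the weighted sum exceeds the threshold, else 0."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


# Hypothetical toy data: the logical AND function, which is linearly separable.
inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
targets = [0, 0, 0, 1]
weights, bias = train_perceptron(inputs, targets)
predictions = [predict(weights, bias, row) for row in inputs]
```

A single perceptron can only separate classes with a straight line (or hyperplane), which is why a linearly separable target such as AND is used here.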

neural network, neuron, prediction, (16 more...)


Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

#artificialintelligence, Jun-16-2021, 18:26:45 GMT

Simple Linear Regression: A layman's explanation

Machine learning and statistics have many applications in business and the social sciences. However, the theory is often intimidating and not easily understood. In this series of articles, I aim to demystify the concepts behind the common tools used in data science and machine learning, starting with linear regression. Linear regression is a statistical method that allows us to describe relationships between variables (distinct things that can be measured or recorded, such as height, weight, and hair colour). It is a special case of the General Linear Model, a framework to describe how a variable of interest can be modelled using other predictor variables. In simple linear regression (SLR), we focus on the relationship between two continuous variables, x and y (hence, simple).
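As a rough illustration of simple linear regression, the sketch below fits a line by ordinary least squares using the standard closed-form formulas. The function name and the hours-versus-score data are made up for this example and do not come from the article.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: hours studied (x) and exam score (y).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]
intercept, slope = fit_simple_linear_regression(hours, scores)
predicted_score = intercept + slope * 6.0  # prediction for 6 hours of study
```

The slope is the expected change in y for a one-unit change in x, and the intercept is the fitted value of y when x is zero.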

exam result, regression, variation, (13 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Kelner, Jonathan, Koehler, Frederic, Meka, Raghu, Rohatgi, Dhruv

On the Power of Preconditioning in Sparse Linear Regression

Sparse linear regression is a fundamental problem in high-dimensional statistics, but strikingly little is known about how to efficiently solve it without restrictive conditions on the design matrix. We consider the (correlated) random design setting, where the covariates are independently drawn from a multivariate Gaussian $N(0,\Sigma)$ with $\Sigma : n \times n$, and seek estimators $\hat{w}$ minimizing $(\hat{w}-w^*)^T\Sigma(\hat{w}-w^*)$, where $w^*$ is the $k$-sparse ground truth. Information theoretically, one can achieve strong error bounds with $O(k \log n)$ samples for arbitrary $\Sigma$ and $w^*$; however, no efficient algorithms are known to match these guarantees even with $o(n)$ samples, without further assumptions on $\Sigma$ or $w^*$. As far as hardness, computational lower bounds are only known with worst-case design matrices. Random-design instances are known which are hard for the Lasso, but these instances can generally be solved by Lasso after a simple change-of-basis (i.e. preconditioning). In this work, we give upper and lower bounds clarifying the power of preconditioning in sparse linear regression. First, we show that the preconditioned Lasso can solve a large class of sparse linear regression problems nearly optimally: it succeeds whenever the dependency structure of the covariates, in the sense of the Markov property, has low treewidth -- even if $\Sigma$ is highly ill-conditioned. Second, we construct (for the first time) random-design instances which are provably hard for an optimally preconditioned Lasso. In fact, we complete our treewidth classification by proving that for any treewidth-$t$ graph, there exists a Gaussian Markov Random Field on this graph such that the preconditioned Lasso, with any choice of preconditioner, requires $\Omega(t^{1/20})$ samples to recover $O(\log n)$-sparse signals when covariates are drawn from this model.
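The error measure in the abstract, $(\hat{w}-w^*)^T\Sigma(\hat{w}-w^*)$, can be made concrete with a small sketch. The helper name and the two-dimensional covariance matrices below are assumptions for illustration only; they are not taken from the paper.

```python
def prediction_error(w_hat, w_star, sigma):
    """Return (w_hat - w_star)^T Sigma (w_hat - w_star) for list-based inputs."""
    diff = [a - b for a, b in zip(w_hat, w_star)]
    return sum(
        diff[i] * sigma[i][j] * diff[j]
        for i in range(len(diff))
        for j in range(len(diff))
    )


# Hypothetical 1-sparse ground truth and an estimate that picks the wrong coordinate.
w_star = [1.0, 0.0]
wrong_support = [0.0, 1.0]

# Strongly correlated covariates versus uncorrelated covariates.
sigma_correlated = [[1.0, 0.9], [0.9, 1.0]]
sigma_identity = [[1.0, 0.0], [0.0, 1.0]]

error_correlated = prediction_error(wrong_support, w_star, sigma_correlated)
error_identity = prediction_error(wrong_support, w_star, sigma_identity)
```

In this toy case the same wrong estimate costs about 0.2 under the correlated design but 2.0 under the identity design, which shows that the metric measures prediction quality under $\Sigma$ rather than exact recovery of the support.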

artificial intelligence, machine learning, matrix, (19 more...)

2106.09207

Genre: Research Report > New Finding (0.45)

Industry: Energy > Oil & Gas (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Diamandis, Theo, Eldar, Yonina C., Fallah, Alireza, Farnia, Farzan, Ozdaglar, Asuman

A Wasserstein Minimax Framework for Mixed Linear Regression

Multi-modal distributions are commonly used to model clustered data in statistical learning tasks. In this paper, we consider the Mixed Linear Regression (MLR) problem. We propose an optimal transport-based framework for MLR problems, Wasserstein Mixed Linear Regression (WMLR), which minimizes the Wasserstein distance between the learned and target mixture regression models. Through a model-based duality analysis, WMLR reduces the underlying MLR task to a nonconvex-concave minimax optimization problem, which can be provably solved to find a minimax stationary point by the Gradient Descent Ascent (GDA) algorithm. In the special case of mixtures of two linear regression models, we show that WMLR enjoys global convergence and generalization guarantees. We prove that WMLR's sample complexity grows linearly with the dimension of data. Finally, we discuss the application of WMLR to the federated learning task where the training samples are collected by multiple agents in a network. Unlike the Expectation Maximization algorithm, WMLR directly extends to the distributed, federated learning setting. We support our theoretical results through several numerical experiments, which highlight our framework's ability to handle the federated learning setting with mixture models.

algorithm, exp, wasserstein minimax framework, (14 more...)

2106.07537

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Han, Xu, Fang, Ethan X, Tang, Cheng Yong

Pre-processing with Orthogonal Decompositions for High-dimensional Explanatory Variables

Strong correlations between explanatory variables are problematic for high-dimensional regularized regression methods. Due to the violation of the Irrepresentable Condition, the popular LASSO method may suffer from false inclusions of inactive variables. In this paper, we propose pre-processing with orthogonal decompositions (PROD) for the explanatory variables in high-dimensional regressions. The PROD procedure is constructed based upon a generic orthogonal decomposition of the design matrix. We demonstrate by two concrete cases that the PROD approach can be effectively constructed for improving the performance of high-dimensional penalized regression. Our theoretical analysis reveals their properties and benefits for high-dimensional penalized linear regression with LASSO.

irrepresentable condition, matrix, prod, (17 more...)

2106.09071

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Pennsylvania > Centre County > University Park (0.04)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

Kowal, Daniel R., Wu, Bohan

Semiparametric count data regression for self-reported mental health

"For how many days during the past 30 days was your mental health not good?" The responses to this question measure self-reported mental health and can be linked to important covariates in the National Health and Nutrition Examination Survey (NHANES). However, these count variables present major distributional challenges: the data are overdispersed, zero-inflated, bounded by 30, and heaped in five- and seven-day increments. To meet these challenges, we design a semiparametric estimation and inference framework for count data regression. The data-generating process is defined by simultaneously transforming and rounding (STAR) a latent Gaussian regression model. The transformation is estimated nonparametrically and the rounding operator ensures the correct support for the discrete and bounded data. Maximum likelihood estimators are computed using an EM algorithm that is compatible with any continuous data model estimable by least squares. STAR regression includes asymptotic hypothesis testing and confidence intervals, variable selection via information criteria, and customized diagnostics. Simulation studies validate the utility of this framework. STAR is deployed to study the factors associated with self-reported mental health and demonstrates substantial improvements in goodness-of-fit compared to existing count data regression models.

mental health, regression, regression model, (16 more...)
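The transform-and-round idea from the abstract above can be sketched as a small simulation. This is only an illustration under stated assumptions: the paper estimates the transformation nonparametrically, whereas this sketch fixes it to a log transform, uses floor as the rounding operator, and invents the function name, coefficients, and covariate values.

```python
import math
import random


def simulate_star_counts(covariates, intercept, slope, noise_sd, upper=30, seed=0):
    """Simulate bounded counts by transforming and rounding a latent Gaussian model.

    Assumptions (not from the paper): a fixed log transformation and a floor
    rounding operator capped at `upper`.
    """
    rng = random.Random(seed)
    counts = []
    for x in covariates:
        # Latent Gaussian linear regression on the transformed scale.
        latent = intercept + slope * x + rng.gauss(0.0, noise_sd)
        # Invert the assumed log transformation to get a continuous positive value.
        continuous = math.exp(latent)
        # Round down to an integer and enforce the upper bound of the survey question.
        counts.append(min(math.floor(continuous), upper))
    return counts


# Hypothetical standardized covariate values.
covariate_values = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
simulated_days = simulate_star_counts(covariate_values, intercept=1.0,
                                      slope=1.2, noise_sd=0.8)
```

Because every continuous value below 1 rounds to 0 and every value above the cap maps to 30, this construction can produce both excess zeros and a bounded support from a single latent Gaussian model.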

2106.09114

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > Monterey County > Pacific Grove (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Leveraging Probabilistic Circuits for Nonparametric Multi-Output Regression

Yu, Zhongjie, Zhu, Mingye, Trapp, Martin, Skryagin, Arseny, Kersting, Kristian

Inspired by recent advances in the field of expert-based approximations of Gaussian processes (GPs), we present an expert-based approach to large-scale multi-output regression using single-output GP experts. Employing a deeply structured mixture of single-output GPs encoded via a probabilistic circuit allows us to capture correlations between multiple output dimensions accurately. By recursively partitioning the covariate space and the output space, posterior inference in our model reduces to ...

... DN), thus, limiting their use to moderately sized data sets. To enable posterior inference in GPs on large-scale problems, recent work (see e.g. Liu et al. [2020] for a detailed review) mainly resorts to global approximations to the posterior, e.g., using inducing points, or local approximations that aim to distribute the computation of the posterior distribution onto local experts. Unfortunately, most of these approaches only focus on single-output regression, i.e., the dependent variable is univariate, and in the case of local approximations, do not easily extend to multi-output regression tasks, see Bruinsma et al. [2020] for a detailed discussion on recent techniques on multi-output GPs.

momogp, output space, regression, (16 more...)

2106.08687

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

arXiv.org Artificial Intelligence, Jun-15-2021

A Markov Reward Process-Based Approach to Spatial Interpolation

Arp, Laurens

The interpolation of spatial data can be of tremendous value in various applications, such as forecasting weather from only a few measurements of meteorological or remote sensing data. Existing methods for spatial interpolation, such as variants of kriging and spatial autoregressive models, tend to suffer from at least one of the following limitations: (a) the assumption of stationarity, (b) the assumption of isotropy, and (c) the trade-off between modelling local or global spatial interaction. Addressing these issues in this work, we propose the use of Markov reward processes (MRPs) as a spatial interpolation method, and we introduce three variants thereof: (i) a basic static discount MRP (SD-MRP), (ii) an accurate but mostly theoretical optimised MRP (O-MRP), and (iii) a transferable weight prediction MRP (WP-MRP). All variants of MRP interpolation operate locally, while also implicitly accounting for global spatial relationships in the entire system through recursion. Additionally, O-MRP and WP-MRP no longer assume stationarity and are robust to anisotropy. We evaluated our proposed methods by comparing the mean absolute errors of their interpolated grid cells to those of 7 common baselines, selected from models based on spatial autocorrelation, (spatial) regression, and deep learning. We performed detailed evaluations on two publicly available datasets (local GDP values, and COVID-19 patient trajectory data). The results from these experiments clearly show the competitive advantage of MRP interpolation, which achieved significantly lower errors than the existing methods in 23 out of 40 experimental conditions, or 35 out of 40 when including O-MRP.

deep learning, interpolation, neural network, (20 more...)


2106.00538

Country:

Asia > South Korea (0.51)
Asia > Taiwan (0.16)
Europe > Netherlands > South Holland (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.87)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)