 Regression


On the Tractability of SHAP Explanations

Journal of Artificial Intelligence Research

SHAP explanations are a popular feature-attribution mechanism for explainable AI. They use game-theoretic notions to measure the influence of individual features on the prediction of a machine learning model. Despite a lot of recent interest from both academia and industry, it is not known whether SHAP explanations of common machine learning models can be computed efficiently. In this paper, we establish the complexity of computing the SHAP explanation in three important settings. First, we consider fully-factorized data distributions, and show that the complexity of computing the SHAP explanation is the same as the complexity of computing the expected value of the model. This fully-factorized setting is often used to simplify the SHAP computation, yet our results show that the computation can be intractable for commonly used models such as logistic regression. Going beyond fully-factorized distributions, we show that computing SHAP explanations is already intractable for a very simple setting: computing SHAP explanations of trivial classifiers over naive Bayes distributions. Finally, we show that even computing SHAP over the empirical distribution is #P-hard.
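To make the fully-factorized setting concrete, the following is a minimal brute-force sketch, not the paper's algorithm. It assumes binary features that are independent with marginals `p[i]`, and a hypothetical toy model `model`; all names are illustrative. It enumerates every feature subset, so its cost grows exponentially with the number of features, and each term is an expected value of the model, which is the quantity the abstract ties the complexity to.

```python
from itertools import combinations, product
from math import factorial


def expected_value(f, x, known, p):
    """E[f(X) | X_S = x_S] for independent Bernoulli(p[i]) features.

    Features in `known` are fixed to their values in x; the rest are
    summed out against their marginals (fully-factorized assumption).
    """
    free = [i for i in range(len(x)) if i not in known]
    total = 0.0
    for bits in product([0, 1], repeat=len(free)):
        z = list(x)
        weight = 1.0
        for i, b in zip(free, bits):
            z[i] = b
            weight *= p[i] if b == 1 else 1.0 - p[i]
        total += weight * f(z)
    return total


def shap_values(f, x, p):
    """Exact Shapley values by enumerating all subsets (exponential time)."""
    n = len(x)
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            coeff = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = expected_value(f, x, s | {i}, p) - expected_value(f, x, s, p)
                contribution += coeff * gain
        phi.append(contribution)
    return phi


def model(z):
    # Hypothetical toy classifier: x0 AND (x1 OR x2)
    return float(z[0] and (z[1] or z[2]))


x = [1, 1, 0]          # instance to explain
p = [0.5, 0.5, 0.5]    # assumed marginals P(X_i = 1)
phi = shap_values(model, x, p)
baseline = expected_value(model, x, set(), p)
```

The attributions in `phi` sum to `model(x) - baseline`, the usual efficiency property of Shapley values, which gives a quick sanity check on the sketch.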


Using regression techniques to predict a student's grade for a course

#artificialintelligence

I will be using Keras and TensorFlow to train a deep neural network to predict the grade using 2 hidden layers, mean squared error loss, and an RMSprop optimizer. Let's graph the error and the loss during training and evaluate the model. We are getting a 0.69 mean absolute error with this approach. We also need to save the model to deploy it in an API. Since I am using Google Colab, I can easily save it to Google Drive. Initialize a random forest with 100 decision trees and train it on the same data.
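The two error measures named above (mean squared error as the training loss, mean absolute error as the reported metric) can be sketched in plain Python. The grade values below are made up for illustration and are not the article's data.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences; penalizes large misses heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences; same units as the grade itself."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical true and predicted grades, for illustration only
grades_true = [10.0, 12.0, 15.0, 8.0]
grades_pred = [11.0, 12.0, 13.0, 8.0]

mse = mean_squared_error(grades_true, grades_pred)
mae = mean_absolute_error(grades_true, grades_pred)
```

Because the mean absolute error is in grade units, a value such as the 0.69 quoted above can be read directly as the typical size of a prediction miss.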


Functional Nonlinear Learning

arXiv.org Machine Learning

Using representations of functional data can be more convenient and beneficial in subsequent statistical models than direct observations. These representations, in a lower-dimensional space, extract and compress information from individual curves. The existing representation learning approaches in functional data analysis usually use linear mapping in parallel to those from multivariate analysis, e.g., functional principal component analysis (FPCA). However, functions, as infinite-dimensional objects, sometimes have nonlinear structures that cannot be uncovered by linear mapping. Linear methods will be more overwhelmed given multivariate functional data. For that matter, this paper proposes a functional nonlinear learning (FunNoL) method to sufficiently represent multivariate functional data in a lower-dimensional feature space. Furthermore, we merge a classification model for enriching the ability of representations in predicting curve labels. Hence, representations from FunNoL can be used for both curve reconstruction and classification. Additionally, we have endowed the proposed model with the ability to address the missing observation problem as well as to further denoise observations. The resulting representations are robust to observations that are locally disturbed by uncontrollable random noises. We apply the proposed FunNoL method to several real data sets and show that FunNoL can achieve better classifications than FPCA, especially in the multivariate functional data setting. Simulation studies have shown that FunNoL provides satisfactory curve classification and reconstruction regardless of data sparsity.


Predicting the Geoeffectiveness of CMEs Using Machine Learning

arXiv.org Artificial Intelligence

Abstract: Coronal mass ejections (CMEs) are the most geoeffective space weather phenomena, being associated with large geomagnetic storms, having the potential to cause disturbances to telecommunication, satellite network disruptions, power grid damage and failures. Thus, considering these storms' potential effects on human activities, accurate forecasts of the geoeffectiveness of CMEs are paramount. This work focuses on experimenting with different machine learning methods trained on white-light coronagraph datasets of close-to-Sun CMEs, to estimate whether such a newly erupting ejection has the potential to induce geomagnetic activity. We developed binary classification models using logistic regression, K-Nearest Neighbors, Support Vector Machines, feed forward artificial neural networks, as well as ensemble models. At this time, we limited our forecast to exclusively use solar onset parameters, to ensure extended warning times. We discuss the main challenges of this task, namely the extreme imbalance between the number of geoeffective and ineffective events in our dataset, along with their numerous similarities and the limited number of available variables. We show that even in such conditions, adequate hit rates can be achieved with these models.
Introduction: The purpose of this work is to develop a machine learning (ML) based model that can predict whether a coronal mass ejection (CME) will be geoeffective, using only numerical solar parameters as input. Coronal mass ejections are solar eruptive events whose magnetically charged particles can, directly or indirectly, under certain circumstances, reach Earth and cause geomagnetic storms (GSs), i.e., be geoeffective. These storms represent perturbations in the Earth's magnetic field, which have the potential to lead to failure of or damage to electrical systems and grids, power outages, navigation errors, radio signal perturbations, significant exposure to dangerous radiation for astronauts during space missions, etc. Given the potential negative impacts of such storms, predicting their occurrence is paramount for enabling safeguarding of human technology (Schwenn 2006; Pulkkinen 2007; Council 2013; Vourlidas et al. 2019; Temmer 2021). The intensity of the storms can be measured by various geomagnetic indices such as Ap, Kp, AE, PC or Dst (see Lockwood 2013, and references therein). Herein, we have chosen to use the values of the Dst index (Sugiura 1964) to establish whether the magnetic field perturbations do, in fact, manifest as storms. This is an index that is calculated using four geomagnetic stations situated at low latitudes. Depending on the value of this index, it can be established whether these perturbations are associated with geomagnetic storms or not. In terms of storm intensity, one of the most popular classifications that takes into consideration the minimum value of the Dst index is that of Gonzalez et al. (1994).
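The class-imbalance challenge mentioned in the abstract can be illustrated with a small sketch. It assumes the usual definition of hit rate as true positives divided by all actual positives; the event counts are hypothetical and are not taken from the paper's dataset.

```python
def hit_rate(tp, fn):
    """Fraction of truly geoeffective events that the model flags."""
    return tp / (tp + fn)


def accuracy(tp, tn, fp, fn):
    """Fraction of all events classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical sample: 1000 CMEs, of which only 20 are geoeffective.
# Classifier A always predicts "not geoeffective".
a = {"tp": 0, "fn": 20, "fp": 0, "tn": 980}
# Classifier B catches 15 of the 20 events at the cost of 100 false alarms.
b = {"tp": 15, "fn": 5, "fp": 100, "tn": 880}

acc_a, hit_a = accuracy(**a), hit_rate(a["tp"], a["fn"])
acc_b, hit_b = accuracy(**b), hit_rate(b["tp"], b["fn"])
```

In this made-up case classifier A has the higher accuracy yet never detects a storm, which is why hit rate is a more informative target than accuracy when positives are rare.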


Supervised Machine Learning: Regression and Classification

#artificialintelligence

In this beginner-friendly program, you will learn the fundamentals of machine learning and how to use these techniques to build real-world AI applications. This Specialization is taught by Andrew Ng, an AI visionary who has led critical research at Stanford University and groundbreaking work at Google Brain, Baidu, and Landing.AI to advance the AI field. This 3-course Specialization is an updated and expanded version of Andrew's pioneering Machine Learning course, rated 4.9 out of 5 and taken by over 4.8 million learners since it launched in 2012. It provides a broad introduction to modern machine learning, including supervised learning (multiple linear regression, logistic regression, neural networks, and decision trees), unsupervised learning (clustering, dimensionality reduction, recommender systems), and some of the best practices used in Silicon Valley for artificial intelligence and machine learning innovation (evaluating and tuning models, taking a data-centric approach to improving performance, and more). By the end of this Specialization, you will have mastered key concepts and gained the practical know-how to quickly and powerfully apply machine learning to challenging real-world problems.


Regression of high dimensional angular momentum states of light

arXiv.org Artificial Intelligence

The Orbital Angular Momentum (OAM) of light is an infinite-dimensional degree of freedom of light with several applications in both classical and quantum optics. However, to fully take advantage of the potential of OAM states, reliable detection platforms to characterize generated states in experimental conditions are needed. Here, we present an approach to reconstruct input OAM states from measurements of the spatial intensity distributions they produce. To obviate issues arising from the intrinsic symmetry of Laguerre-Gauss modes, we employ a pair of intensity profiles per state, obtained by projecting it onto only two distinct bases, and show how this allows input states to be uniquely recovered from the collected data. Our approach is based on a combined application of dimensionality reduction via principal component analysis, and linear regression, and thus has a low computational cost during both training and testing stages. We showcase our approach in a real photonic setup, generating up-to-four-dimensional OAM states through quantum walk dynamics. The high performance and versatility of the demonstrated approach make it an ideal tool to characterize high dimensional states in quantum information protocols.


Quantitative CT texture-based method to predict diagnosis and prognosis of fibrosing interstitial lung disease patterns

arXiv.org Machine Learning

Purpose: To utilize high-resolution quantitative CT (QCT) imaging features for prediction of diagnosis and prognosis in fibrosing interstitial lung diseases (ILD). Approach: 40 ILD patients (20 usual interstitial pneumonia (UIP), 20 non-UIP pattern ILD) were classified by expert consensus of 2 radiologists and followed for 7 years. Clinical variables were recorded. Following segmentation of the lung field, a total of 26 texture features were extracted using a lattice-based approach (TM model). The TM model was compared with a previous histogram-based model (HM) in its ability to classify UIP vs non-UIP. For prognostic assessment, survival analysis was performed comparing the expert diagnostic labels versus TM metrics. Results: In the classification analysis, the TM model outperformed the HM method with an AUC of 0.70. While survival curves of UIP vs non-UIP expert labels in Cox regression analysis were not statistically different, TM QCT features allowed statistically significant partition of the cohort. Conclusions: The TM model outperformed the HM model in distinguishing UIP from non-UIP patterns. Most importantly, TM allows for partitioning of the cohort into distinct survival groups, whereas expert UIP vs non-UIP labeling does not. QCT TM models may improve diagnosis of ILD and offer more accurate prognostication, better guiding patient management.


Noise Estimation in Gaussian Process Regression

arXiv.org Machine Learning

We develop a computational procedure to estimate the covariance hyperparameters for semiparametric Gaussian process regression models with additive noise. Namely, the presented method can be used to efficiently estimate the variance of the correlated error and the variance of the noise by maximizing a marginal likelihood function. Our method involves suitably reducing the dimensionality of the hyperparameter space to simplify the estimation procedure to a univariate root-finding problem. Moreover, we derive bounds and asymptotes of the marginal likelihood function and its derivatives, which are useful for narrowing the initial range of the hyperparameter search. Using numerical examples, we demonstrate the computational advantages and robustness of the presented approach compared to traditional parameter optimization.


Linear Regression: Mathematical Intuition

#artificialintelligence

Since the start of your data science journey, you have probably become familiar with this machine learning algorithm. Linear regression is the basic and foremost machine learning algorithm we generally start with when analysing regression problems. As the word "linear" suggests, it assumes a linear relationship between the input variables (x) and the dependent output variable (y). Basically, linear regression analysis predicts the output variable by modelling its relationship with the independent variables (x). The best output is found by fitting the predicted line as closely as possible to the best-fit line.
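As an illustration of fitting a line to data, here is a minimal sketch of ordinary least squares for a single input variable, using the standard closed-form slope and intercept. The function name `fit_line` and the sample points are assumptions made for this example and do not come from the article.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative points that lie exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)

# Predict the output for a new input using the fitted line
prediction = slope * 5.0 + intercept
```

With noisy data the points no longer fall on one line, and the same formulas return the line that minimizes the sum of squared vertical distances to the points.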


Math for Machine Learning: 14 Must-Read Books - Machine Learning Techniques

#artificialintelligence

It is possible to design and deploy advanced machine learning algorithms that are essentially math-free and stats-free. People working on that are typically professional mathematicians. These algorithms are not necessarily simpler. See for instance a math-free regression technique with prediction intervals, here. Or supervised classification and alternative to t-SNE, here. Interestingly, this latter math-free machine