 Regression


UCB-based Algorithms for Multinomial Logistic Regression Bandits

arXiv.org Machine Learning

Out of the rich family of generalized linear bandits, perhaps the most well studied are logistic bandits, which are used in problems with binary rewards: for instance, when the learner/agent tries to maximize the profit over a user that can select one of two possible outcomes (e.g., 'click' vs 'no-click'). Despite remarkable recent progress and improved algorithms for logistic bandits, existing works do not address practical situations where the number of outcomes that can be selected by the user is larger than two (e.g., 'click', 'show me later', 'never show again', 'no click'). In this paper, we study such an extension. We use multinomial logit (MNL) to model the probability of each one of $K+1\geq 2$ possible outcomes (+1 stands for the 'not click' outcome): we assume that for a learner's action $\mathbf{x}_t$, the user selects one of the $K+1$ outcomes, say outcome $i$, according to an MNL probabilistic model with corresponding unknown parameter $\bar{\boldsymbol\theta}_{\ast i}$. Each outcome $i$ is also associated with a revenue parameter $\rho_i$ and the goal is to maximize the expected revenue. For this problem, we present MNL-UCB, an upper confidence bound (UCB)-based algorithm, that achieves regret $\tilde{\mathcal{O}}(dK\sqrt{T})$ with small dependency on problem-dependent constants that can otherwise be arbitrarily large and lead to loose regret bounds. We present numerical simulations that corroborate our theoretical results.
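The MNL outcome model and the expected-revenue objective described above can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's MNL-UCB algorithm: the function names are hypothetical, and it assumes the common MNL convention that the baseline ('no click') outcome has score 0 and revenue 0.

```python
import math

def mnl_probabilities(x, thetas):
    """Return [p_0, p_1, ..., p_K] for action x.

    Index 0 is the baseline ('no click') outcome, assumed to have score 0.
    thetas is a list of K parameter vectors, one per non-baseline outcome.
    """
    scores = [sum(t_j * x_j for t_j, x_j in zip(theta, x)) for theta in thetas]
    shift = max(0.0, max(scores))  # subtract the largest score for numerical stability
    exp_scores = [math.exp(s - shift) for s in scores]
    base = math.exp(-shift)        # baseline outcome, score 0
    denom = base + sum(exp_scores)
    return [base / denom] + [e / denom for e in exp_scores]

def expected_revenue(x, thetas, rhos):
    """Expected revenue sum_i rho_i * p_i, assuming the baseline earns nothing."""
    probs = mnl_probabilities(x, thetas)
    return sum(rho * p for rho, p in zip(rhos, probs[1:]))
```

With all parameter vectors set to zero, every one of the $K+1$ outcomes is equally likely, which makes a convenient sanity check.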


Comments on Leo Breiman's paper 'Statistical Modeling: The Two Cultures' (Statistical Science, 2001, 16(3), 199-231)

arXiv.org Machine Learning

Breiman challenged statisticians to think more broadly, to step into the unknown, model-free learning world, with him paving the way forward. The statistics community responded with slight optimism, some skepticism, and plenty of disbelief. Today, we are at the same crossroads again. Faced with the enormous practical success of model-free, deep, and machine learning, we are naturally inclined to think that everything is resolved. A new frontier has emerged: one where the role, impact, or stability of the learning algorithms is no longer measured by prediction quality but by inferential quality, and where the questions of why and if can no longer be safely ignored.


Combining initial chest CT with clinical variables in differentiating coronavirus disease 2019 (COVID-19) pneumonia from influenza pneumonia

#artificialintelligence

Coronavirus disease 2019 (COVID-19) has spread in more than 100 countries and regions around the world, raising grave global concerns. COVID-19 has a similar pattern of infection, clinical symptoms, and chest imaging findings to influenza pneumonia. In this retrospective study, we analysed clinical and chest CT data of 24 patients with COVID-19 and 79 patients with influenza pneumonia. Univariate analysis demonstrated that temperature, systolic pressure, cough and sputum production could distinguish COVID-19 from influenza pneumonia. The diagnostic sensitivity and specificity for the clinical features are 0.783 and 0.747, and the AUC value is 0.819. Univariate analysis also demonstrated that nine CT features (central–peripheral distribution, superior–inferior distribution, anterior–posterior distribution, patches of GGO, GGO nodule, vascular enlargement in GGO, air bronchogram, bronchiectasis within focus, and interlobular septal thickening) could distinguish COVID-19 from influenza pneumonia. The diagnostic sensitivity and specificity for the CT features are 0.750 and 0.962, and the AUC value is 0.927. Finally, a multivariate logistic regression model was built that combined the variables from the clinical and CT feature models. The combined model contained six features: systolic blood pressure, sputum production, vascular enlargement in the GGO, GGO nodule, central–peripheral distribution and bronchiectasis within focus. The diagnostic sensitivity and specificity for the combined features are 0.87 and 0.96, and the AUC value is 0.961. In conclusion, some CT features or clinical variables can differentiate COVID-19 from influenza pneumonia. Moreover, CT features combined with clinical variables had higher diagnostic performance.
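For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how sensitivity and specificity are computed from binary labels and predictions. The function name and the toy labels in the usage note are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)
```

For example, with toy labels `[1, 1, 1, 1, 0, 0, 0, 0, 0, 0]` and predictions `[1, 1, 1, 0, 0, 0, 0, 0, 0, 1]`, three of four positives and five of six negatives are classified correctly.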


PCA Vs Linear Regression - Therefore You Should Know The Differences – Fly Spaceships With Your Mind

#artificialintelligence

PCA vs Linear Regression – Two statistical methods that work very similarly. However, they differ in one important respect. In the following article, we explain what the two methods actually are and what this difference is. Principal Component Analysis (PCA) is a multivariate statistical method for structuring or simplifying a large data set. The main goal here is to discover relationships in a two- or three-dimensional domain.
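One standard way to see how the two methods differ on two-dimensional data is to compare the line each one produces: ordinary least squares minimizes vertical residuals, while the first principal component minimizes perpendicular distances to the line. The snippet above does not say which difference the article has in mind, so the sketch below is only an illustration under that assumption; the function names are hypothetical and NumPy is assumed to be available.

```python
import numpy as np

def ols_slope(x, y):
    """Slope of the least-squares regression of y on x (vertical residuals)."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (xc @ xc))

def first_pc_slope(x, y):
    """Slope of the first principal component direction (perpendicular distances)."""
    data = np.column_stack([x - x.mean(), y - y.mean()])
    cov = data.T @ data / (len(x) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    v = eigvecs[:, -1]                      # direction of largest variance
    return float(v[1] / v[0])
```

On noisy, positively correlated data the principal-component line comes out steeper than the regression line, because regression attributes all of the noise to `y`.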


Use Machine Learning to Make Apps and AI to Detect Fraud

#artificialintelligence

Make your first machine learning model with the TensorFlow framework. Make an Android app that can analyze and predict handwritten digit data. Make an advanced app with the MNIST database of digits. Make an app that can predict the weather. Description: This is our epic course with 5 projects in artificial intelligence and machine learning: 01.


Differentially private inference via noisy optimization

arXiv.org Machine Learning

Over the last decade, differential privacy has evolved from a rigorous paradigm derived by theoretical computer scientists for releasing sensitive data to a technology deployed at scale in numerous applications [Ding et al., 2017, Erlingsson et al., 2014, Garfinkel et al., 2019, Tang et al., 2017]. The setting assumes the existence of a trusted curator who holds the data of individuals in a database, and the goal of privacy is to simultaneously protect individual data while allowing statistical analysis of the aggregate database. Such protection is guaranteed by differential privacy in the context of a remote access query system, where a statistician can only indirectly access the data, e.g., by obtaining noisy summary statistics or outputs of a model. Injecting noise before releasing information to the statistician is essential for preserving privacy, and the noise should be as small as possible in order to optimize statistical performance of the released statistics. In this paper, we consider the problem of estimation and inference for M-estimators. Inspired by the work of Bassily et al. [2014], Lee and Kifer [2018], Song et al. [2013], and Feldman et al. [2020], among others, we propose noisy optimization procedures that output differentially private counterparts of standard M-estimators. The central idea of these methods is to add noise to every iterate of a gradient-based optimization routine in a way that causes each iterate to satisfy a targeted differential privacy guarantee.
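The central idea, adding noise to every iterate of a gradient-based routine, can be illustrated with a one-dimensional location M-estimator. The sketch below is not the paper's procedure: it assumes a Huber loss (whose bounded score function limits each observation's influence on the average gradient), and the `noise_multiplier` value is a hypothetical placeholder that is not calibrated to any specific privacy guarantee.

```python
import random

def huber_psi(r, c=1.345):
    """Derivative of the Huber loss: the residual clipped to [-c, c]."""
    return max(-c, min(c, r))

def noisy_gradient_descent(data, n_iters=50, step=0.5, c=1.345,
                           noise_multiplier=5.0, seed=0):
    """Noisy gradient descent for a Huber location estimate.

    Replacing one observation changes the average score by at most 2c/n,
    so Gaussian noise is scaled relative to that bound. The multiplier is
    illustrative only and does not correspond to a stated privacy budget.
    """
    rng = random.Random(seed)
    n = len(data)
    sensitivity = 2.0 * c / n
    theta = 0.0
    for _ in range(n_iters):
        avg_psi = sum(huber_psi(x - theta, c) for x in data) / n
        noise = rng.gauss(0.0, noise_multiplier * sensitivity)
        theta += step * (avg_psi + noise)  # descent step on the noisy gradient
    return theta
```

Because the noise scale shrinks with the sample size `n`, the noisy estimate stays close to the non-private Huber estimate on moderately large samples.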


Computational Emotion Analysis From Images: Recent Advances and Future Directions

arXiv.org Artificial Intelligence

Understanding the information contained in the increasing repository of data is of vital importance to the behavioral sciences [34], which aim to predict human decision making and enable wide applications, such as mental health evaluation [14], business recommendation [33], opinion mining [54], and entertainment assistance [78]. Analyzing media data on an affective (emotional) level belongs to affective computing, which is defined as "the computing that relates to, arises from, or influences emotions" [38]. The importance of emotions has been emphasized for decades since Minsky introduced the relationship between intelligence and emotion [31]. One famous claim is "The question is not whether intelligent machines can have any emotions, but whether machines can be intelligent without emotions." Based on the types of media data, the research on affective computing can be classified into different categories, such as text [13, 72], image [75], speech [45], music [64], facial expression [24], video [56, 79], physiological signals [2], and multi-modal data [52, 41, 80]. The adage "a picture is worth a thousand words" indicates that images can convey rich semantics. Therefore, images are used as an important channel to express emotions. Image emotion analysis (IEA) has recently received much attention. Compared with analyzing the cognitive aspect of images, which relates to objective content [15] (such as object classification and semantic segmentation), IEA focuses on understanding what emotions the images can induce in viewers.


Locally Weighted Linear Regression in Python

#artificialintelligence

In this article, we will implement a non-parametric learning algorithm called Locally Weighted Linear Regression. First, we will look at the difference between parametric and non-parametric learning algorithms, followed by the weighting function and the predict function, and finally we will plot the predictions using Python, NumPy and Matplotlib. Parametric: In a parametric algorithm, we have a fixed set of parameters, such as theta, whose optimal values we try to find while training on the data. After we have found the optimal values for these parameters, we can put the data aside or erase it from the computer and just use the model with its parameters to make predictions. Remember, the model is just a function.
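The weighting and predict functions mentioned above might look like the following NumPy sketch. It assumes the usual Gaussian kernel with a bandwidth parameter `tau` and the weighted normal equations; the function names and default values are assumptions, and the article's own implementation may differ.

```python
import numpy as np

def gaussian_weights(X, x_query, tau):
    """Weight each training point by its distance to the query point."""
    sq_dist = np.sum((X - x_query) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * tau ** 2))

def predict(X, y, x_query, tau=0.5):
    """Fit a weighted linear model around x_query and evaluate it there."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # add intercept column
    xq = np.concatenate([[1.0], x_query])
    w = gaussian_weights(X, x_query, tau)
    XtW = Xb.T * w                                  # same as Xb.T @ diag(w)
    theta = np.linalg.pinv(XtW @ Xb) @ (XtW @ y)    # weighted normal equations
    return float(xq @ theta)
```

Unlike a parametric model, `theta` is recomputed for every query point, so the training data must be kept around at prediction time.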


Linear Iterative Feature Embedding: An Ensemble Framework for Interpretable Model

arXiv.org Machine Learning

A new ensemble framework for interpretable models called Linear Iterative Feature Embedding (LIFE) has been developed to achieve high prediction accuracy, easy interpretation and efficient computation simultaneously. The LIFE algorithm is able to fit a wide single-hidden-layer neural network (NN) accurately in three steps: defining the subsets of a dataset by the linear projections of neural nodes; creating the features from multiple narrow single-hidden-layer NNs trained on the different subsets of the data; and combining the features with a linear model. The theoretical rationale behind LIFE is also provided by the connection to the loss ambiguity decomposition of stack ensemble methods. Both simulation and empirical experiments confirm that LIFE consistently outperforms directly trained single-hidden-layer NNs and also outperforms many other benchmark models, including multi-layer Feed Forward Neural Network (FFNN), XGBoost, and Random Forest (RF), in many experiments. As a wide single-hidden-layer NN, LIFE is intrinsically interpretable. Meanwhile, both variable importance and global main and interaction effects can be easily created and visualized. In addition, the parallel nature of the base learner building makes LIFE computationally efficient by leveraging parallel computing.


DoubleML -- An Object-Oriented Implementation of Double Machine Learning in R

arXiv.org Machine Learning

Structural equation models provide a quintessential framework for conducting causal inference in statistics, econometrics, machine learning (ML), and other data sciences. The package DoubleML for R (R Core Team, 2020) implements partially linear and interactive structural equation and treatment effect models with high-dimensional confounding variables as considered in Chernozhukov et al. (2018). Estimation and tuning of the machine learning models is based on the powerful functionalities provided by the mlr3 package and the mlr3 ecosystem (Lang et al., 2019). Key elements of double machine learning (DML) models are the score functions identifying the estimates for the target parameter. These functions play an essential role for valid inference with machine learning methods because they have to satisfy a property called Neyman orthogonality. With the score functions as key elements, DoubleML implements double machine learning in a very general way using object orientation based on the R6 package (Chang, 2020). Currently, DoubleML implements the double / debiased machine learning framework as established in Chernozhukov et al. (2018) for
- partially linear regression models (PLR),
- partially linear instrumental variable regression models (PLIV),
- interactive regression models (IRM), and
- interactive instrumental variable regression models (IIVM).
The object-oriented implementation of DoubleML is very flexible. The model classes DoubleMLPLR, DoubleMLPLIV, DoubleMLIRM and DoubleMLIIVM implement the estimation of the nuisance functions via machine learning methods and the computation of the Neyman-orthogonal score function.
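To make the role of the score function concrete, the partially linear regression case can be sketched from scratch in Python with NumPy. This does not use or reproduce the DoubleML package API: the function names are hypothetical, the "partialling-out" form of the score with cross-fitting is assumed, and ordinary least squares stands in for the machine learning nuisance learners.

```python
import numpy as np

def ols_fit_predict(X_train, y_train, X_test):
    """Stand-in nuisance learner: least squares with an intercept."""
    A = np.hstack([np.ones((X_train.shape[0], 1)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    B = np.hstack([np.ones((X_test.shape[0], 1)), X_test])
    return B @ coef

def plr_partialling_out(y, d, X, n_folds=5, seed=0):
    """Cross-fitted estimate of theta in y = theta * d + g(X) + noise."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False  # nuisance models never see the held-out fold
        y_res[test_idx] = y[test_idx] - ols_fit_predict(X[train], y[train], X[test_idx])
        d_res[test_idx] = d[test_idx] - ols_fit_predict(X[train], d[train], X[test_idx])
    # Solve the orthogonal score equation sum((y_res - theta * d_res) * d_res) = 0
    return float(np.sum(d_res * y_res) / np.sum(d_res ** 2))
```

Residualizing both the outcome and the treatment on the confounders is what makes the estimate insensitive to small errors in the nuisance fits, which is the practical content of Neyman orthogonality.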