Performance Analysis
Evaluation Metrics
Evaluation metrics are used to measure the quality of a statistical or machine learning model. Evaluating machine learning models or algorithms is essential for any project. There are many different types of evaluation metrics available to test a model. These include classification accuracy, logarithmic loss, confusion matrix, and others. Classification accuracy is the ratio of the number of correct predictions to the total number of input samples, and it is usually what we mean when we use the term accuracy.
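As a minimal sketch of the three metrics named above, the following pure-Python snippet computes classification accuracy, binary logarithmic loss, and a confusion matrix. The labels and probabilities are made-up illustration data, and the function names are hypothetical helpers rather than any particular library's API.

```python
import math
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def log_loss(y_true, p_pos, eps=1e-15):
    """Mean binary cross-entropy; p_pos is the predicted probability of class 1."""
    total = 0.0
    for t, p in zip(y_true, p_pos):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

def confusion_matrix(y_true, y_pred):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(y_true, y_pred))

# Illustrative data only (not taken from any real model).
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
p_pos = [0.9, 0.2, 0.4, 0.8, 0.6]

acc = accuracy(y_true, y_pred)
ll = log_loss(y_true, p_pos)
cm = confusion_matrix(y_true, y_pred)
print(acc, round(ll, 4), dict(cm))
```

Accuracy only looks at the hard labels, while logarithmic loss also penalizes confident wrong probabilities, which is why the two can rank models differently.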
Choosing a Machine Learning Model
The number of shiny models out there can be overwhelming, which means a lot of times people fall back on a few they trust the most and use them on all new problems. This can lead to sub-optimal results. Today we're going to learn how to quickly and efficiently narrow down the space of available models to find those that are most likely to perform best on your problem type. We'll also see how we can keep track of our models' performances using Weights and Biases and compare them. You can find the accompanying code here.
AKM$^2$D : An Adaptive Framework for Online Sensing and Anomaly Quantification
Yan, Hao, Paynabar, Kamran, Shi, Jianjun
In point-based sensing systems such as coordinate measuring machines (CMM) and laser ultrasonics where complete sensing is impractical due to the high sensing time and cost, adaptive sensing through a systematic exploration is vital for online inspection and anomaly quantification. Most of the existing sequential sampling methodologies focus on reducing the overall fitting error for the entire sampling space. However, in many anomaly quantification applications, the main goal is to estimate sparse anomalous regions accurately at the pixel level. In this paper, we develop a novel framework named Adaptive Kernelized Maximum-Minimum Distance (AKM$^2$D) to speed up the inspection and anomaly detection process through an intelligent sequential sampling scheme integrated with fast estimation and detection. The proposed method balances the sampling efforts between the space-filling sampling (exploration) and focused sampling near the anomalous region (exploitation). The proposed methodology is validated by conducting simulations and a case study of anomaly detection in composite sheets using a guided wave test.
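To make the exploration side concrete, here is a rough sketch of plain greedy maximum-minimum distance sampling on a grid. It is not the AKM$^2$D method from the paper: it omits the kernelized, anomaly-focused (exploitation) part entirely, and the grid, starting point, and function name are assumptions chosen for illustration.

```python
import math

def maximin_next_point(candidates, sampled):
    """Return the candidate whose distance to its nearest sampled point is largest."""
    def nearest_dist(c):
        return min(math.dist(c, s) for s in sampled)
    return max(candidates, key=nearest_dist)

# Hypothetical 5x5 inspection grid with one initial measurement at a corner.
grid = [(x, y) for x in range(5) for y in range(5)]
sampled = [(0, 0)]

# Greedily add three more space-filling sampling locations.
for _ in range(3):
    sampled.append(maximin_next_point(grid, sampled))

print(sampled)
```

Each new point lands as far as possible from everything measured so far, so the samples spread over the whole space. An adaptive scheme of the kind the abstract describes would additionally pull samples toward regions that look anomalous.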
A Comparison Study on Nonlinear Dimension Reduction Methods with Kernel Variations: Visualization, Optimization and Classification
Kempfert, Katherine C., Wang, Yishi, Chen, Cuixian, Wong, Samuel W. K.
Because of high dimensionality, correlation among covariates, and noise contained in data, dimension reduction (DR) techniques are often employed in the application of machine learning algorithms. Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and their kernel variants (KPCA, KLDA) are among the most popular DR methods. Recently, Supervised Kernel Principal Component Analysis (SKPCA) has been shown to be another successful alternative. In this paper, brief reviews of these popular techniques are presented first. We then conduct a comparative performance study based on three simulated datasets, after which the performance of the techniques is evaluated through application to a pattern recognition problem in face image analysis. The gender classification problem is considered on MORPH-II and FG-NET, two popular longitudinal face aging databases. Several feature extraction methods are used, including biologically-inspired features (BIF), local binary patterns (LBP), histogram of oriented gradients (HOG), and the Active Appearance Model (AAM). After applications of DR methods, a linear support vector machine (SVM) is deployed with gender classification accuracy rates exceeding 95% on MORPH-II, competitive with benchmark results. A parallel computational approach is also proposed, attaining faster processing speeds and similar recognition rates on MORPH-II. Our computational approach can be applied to practical gender classification systems and generalized to other face analysis tasks, such as race classification and age prediction.
On Tractable Computation of Expected Predictions
Khosravi, Pasha, Choi, YooJung, Liang, Yitao, Vergari, Antonio, Broeck, Guy Van den
Computing expected predictions has many interesting applications in areas such as fairness, handling missing values, and data analysis. Unfortunately, computing expectations of a discriminative model with respect to a probability distribution defined by an arbitrary generative model has been proven to be hard in general. In fact, the task is intractable even for simple models such as logistic regression and a naive Bayes distribution. In this paper, we identify a pair of generative and discriminative models that enables tractable computation of expectations of the latter with respect to the former, as well as moments of any order, in the case of regression. Specifically, we consider expressive probabilistic circuits with certain structural constraints that support tractable probabilistic inference. Moreover, we exploit the tractable computation of high-order moments to derive an algorithm to approximate the expectations, for classification scenarios in which exact computations are intractable. We evaluate the effectiveness of our exact and approximate algorithms in handling missing data during prediction time, where they prove to be competitive with standard imputation techniques on a variety of datasets. Finally, we illustrate how the expected prediction framework can be used to reason about the behaviour of discriminative models.
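As a toy illustration of what an "expected prediction" is, the sketch below computes the exact expectation of a linear regression model over missing binary features by brute-force enumeration, assuming a fully factorized (independent) feature distribution. This is not the paper's probabilistic-circuit algorithm; the weights, marginals, and helper names are made up, and enumeration is exponential in the number of missing features.

```python
from itertools import product

def linear_predict(w, b, x):
    """Plain linear regression prediction w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def expected_prediction(w, b, x, marginals):
    """Exact E[f(x)] over the missing entries of x (marked None), by enumeration.

    marginals[i] maps each value of feature i to its probability.
    Assumption: features are independent of each other.
    """
    missing = [i for i, xi in enumerate(x) if xi is None]
    total = 0.0
    for values in product(*(marginals[i].items() for i in missing)):
        filled = list(x)
        prob = 1.0
        for i, (v, p) in zip(missing, values):
            filled[i] = v
            prob *= p
        total += prob * linear_predict(w, b, filled)
    return total

# Hypothetical model and feature distribution.
w, b = [2.0, -1.0, 0.5], 1.0
marginals = [{0: 0.5, 1: 0.5}, {0: 0.25, 1: 0.75}, {0: 0.9, 1: 0.1}]
x = [1, None, None]  # features 1 and 2 are missing

exact = expected_prediction(w, b, x, marginals)

# For a linear model, plugging in the feature means gives the same value.
means = [sum(v * p for v, p in m.items()) for m in marginals]
imputed = linear_predict(w, b, [xi if xi is not None else means[i] for i, xi in enumerate(x)])
print(exact, imputed)
```

The agreement with mean imputation holds because the model is linear. Passing the same expectation through a nonlinearity such as the logistic function breaks this shortcut, which gives some intuition for why the classification case is harder.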
Confederated Machine Learning on Horizontally and Vertically Separated Medical Data for Large-Scale Health System Intelligence
Liu, Dianbo, Miller, Timothy A, Mandl, Kenneth D.
Access to a large amount of high-quality data is possibly the most important factor for success in advancing medicine with machine learning and data science. However, valuable healthcare data are usually distributed across isolated silos, and there are complex operational and regulatory concerns. Data on patient populations are often horizontally separated from each other across different practices and health systems. In addition, individual patient data are often vertically separated, by data type, across the patient's sites of care, service, and testing. We train a confederated learning model to stratify elderly patients by their risk of a fall in the next two years, using diagnoses, medication claims data and clinical lab test records of patients.
Group-based Fair Learning Leads to Counter-intuitive Predictions
A number of machine learning (ML) methods have been proposed recently to maximize model predictive accuracy while enforcing notions of group parity or fairness across sub-populations. We propose a desirable property for these procedures, slack-consistency: For any individual, the predictions of the model should be monotonic with respect to allowed slack (i.e., maximum allowed group-parity violation). Such monotonicity can be useful for individuals to understand the impact of enforcing fairness on their predictions. Surprisingly, we find that standard ML methods for enforcing fairness violate this basic property. Moreover, this undesirable behavior arises in situations agnostic to the complexity of the underlying model or approximate optimizations, suggesting that the simple act of incorporating a constraint can lead to drastically unintended behavior in ML. We present a simple theoretical method for enforcing slack-consistency, while encouraging further discussions on the unintended behaviors potentially induced when enforcing group-based parity.
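The slack-consistency property can be checked mechanically for a single individual. The sketch below assumes the model has been retrained at several slack values and that one individual's predictions are collected in order of increasing slack. The function name and the example numbers are hypothetical.

```python
def is_slack_consistent(predictions):
    """Check monotonicity of one individual's predictions.

    predictions: model outputs for the same individual, ordered by
    increasing allowed slack (maximum allowed group-parity violation).
    Returns True if the sequence is entirely non-decreasing or
    entirely non-increasing.
    """
    pairs = list(zip(predictions, predictions[1:]))
    non_decreasing = all(a <= b for a, b in pairs)
    non_increasing = all(a >= b for a, b in pairs)
    return non_decreasing or non_increasing

# Made-up scores for two individuals at slack = 0.00, 0.05, 0.10, 0.20.
steady = [0.40, 0.45, 0.45, 0.60]
erratic = [0.40, 0.55, 0.35, 0.60]

print(is_slack_consistent(steady), is_slack_consistent(erratic))
```

A check like this only detects violations on the slack values that were actually tried; it says nothing about behavior between them.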
Unsupervised Representation for EHR Signals and Codes as Patient Status Vector
Darabi, Sajad, Kachuee, Mohammad, Sarrafzadeh, Majid
Effective modeling of electronic health records presents many challenges, as they contain large amounts of irregularity, most of which is due to the varying procedures and diagnoses a patient may have. Despite the recent progress in machine learning, unsupervised learning remains a largely open problem, especially in the healthcare domain. In this work, we present a two-step unsupervised representation learning scheme to summarize the multi-modal clinical time series consisting of signals and medical codes into a patient status vector. First, an auto-encoder step is used to reduce sparse medical codes and clinical time series into a distributed representation. Subsequently, the concatenation of the distributed representations is further fine-tuned using a forecasting task. We evaluate the usefulness of the representation on two downstream tasks: mortality and readmission. Our proposed method shows improved generalization performance for both short duration ICU visits and long duration ICU visits.
Predictive Analytics using Machine Learning
Below you will read in the training and test data, which are already split so that you can load them separately. Then use unnest_tokens() from tidytext to create the tidy version with one word per record. Now that you have train and test data loaded and tidied, you can see how many songs exist per artist/author. Since the dataset has songs and book pages, I'll refer to each of them as a document. The features that you will create are based on documents and their associated metadata, so it's important to understand this concept.
11 Important Model Evaluation Error Metrics Everyone should know
This article was originally published in February 2016 and updated in August 2019. The idea of building machine learning models works on a constructive feedback principle. You build a model, get feedback from metrics, make improvements and continue until you achieve a desirable accuracy. Evaluation metrics explain the performance of a model. An important aspect of evaluation metrics is their capability to discriminate among model results. I have seen plenty of analysts and aspiring data scientists not even bothering to check how robust their model is. Once they are finished building a model, they hurriedly map predicted values on unseen data. This is an incorrect approach. Simply building a predictive model is not the goal. The goal is to create and select a model that gives high accuracy on out-of-sample data.
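One simple way to estimate out-of-sample performance is to hold out part of the data before fitting anything. The sketch below is a bare-bones holdout split in pure Python. The function name, the 25% test fraction, and the fixed seed are assumptions for illustration, not something prescribed by the article.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Stand-in dataset: 20 record ids.
rows = list(range(20))
train, test = train_test_split(rows)

print(len(train), len(test))
```

The model is then fitted on `train` only, and the metric reported is the one computed on `test`, which the model has never seen.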