 Performance Analysis


A Gentle Introduction to Self-Training and Semi-Supervised Learning

#artificialintelligence

When it comes to machine learning classification tasks, the more data available to train algorithms, the better. In supervised learning, this data must be labeled with respect to the target class -- otherwise, these algorithms wouldn't be able to learn the relationships between the independent and target variables. So, what if we only have enough time and money to label some of a large data set, and choose to leave the rest unlabeled? Can this unlabeled data somehow be used in a classification algorithm? This is where semi-supervised learning comes in.


On the Identification of Fair Auditors to Evaluate Recommender Systems based on a Novel Non-Comparative Fairness Notion

arXiv.org Artificial Intelligence

Decision-support systems are information systems that support people's decisions in applications such as the judiciary, real-estate and banking sectors. Lately, these systems have been found to be discriminatory in many practical deployments. In an attempt to evaluate and mitigate these biases, the algorithmic fairness literature has developed around notions of comparative justice, which rely primarily on comparing two or more individuals or groups within the society supported by such systems. However, such a fairness notion is not very useful for identifying fair auditors who are hired to evaluate latent biases within decision-support systems. As a solution, we introduce a paradigm shift in algorithmic fairness by proposing a new fairness notion based on the principle of non-comparative justice. Assuming that the auditor makes fairness evaluations based on some (potentially unknown) desired properties of the decision-support system, the proposed fairness notion compares the system's outcome with the auditor's desired outcome. We show that the proposed fairness notion also provides guarantees in terms of comparative fairness notions by proving that any system can be deemed fair from the perspective of comparative fairness (e.g. individual fairness and statistical parity) if it is non-comparatively fair with respect to an auditor who has been deemed fair with respect to the same fairness notions. We also show that the converse holds true in the context of individual fairness. We also briefly discuss how our fairness notion can be used to identify fair and reliable auditors, and how such auditors can be used to quantify biases in decision-support systems.


Machine Intelligence for Outcome Predictions of Trauma Patients During Emergency Department Care

arXiv.org Artificial Intelligence

Trauma mortality results from a multitude of non-linear dependent risk factors including patient demographics, injury characteristics, medical care provided, and characteristics of medical facilities; yet traditional approaches have attempted to capture these relationships using rigid regression models. We hypothesized that a transfer learning based machine learning algorithm could deeply understand a trauma patient's condition and accurately identify individuals at high risk for mortality without relying on restrictive regression model criteria. Anonymous patient visit data were obtained from years 2007-2014 of the National Trauma Data Bank. Patients with incomplete vitals, unknown outcome, or missing demographics data were excluded. All patient visits occurred in U.S. hospitals, and of the 2,007,485 encounters that were retrospectively examined, 8,198 resulted in mortality (0.4%). The machine intelligence model was evaluated on its sensitivity, specificity, positive and negative predictive value, and Matthews Correlation Coefficient. Our model achieved similar performance in age-specific comparison models and generalized well when applied to all ages simultaneously. While testing for confounding factors, we discovered that excluding fall-related injuries boosted performance for adult trauma patients; however, it reduced performance for children. The machine intelligence model described here demonstrates similar performance to contemporary machine intelligence models without requiring restrictive regression model criteria or extensive medical expertise.
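With a mortality rate of about 0.4%, a model that predicts survival for every patient would already be about 99.6% accurate, which is why the abstract reports sensitivity, specificity, predictive values, and the Matthews Correlation Coefficient instead. The sketch below shows how these metrics follow from a binary confusion matrix. The counts are hypothetical and chosen only for illustration; they are not results from the paper.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of actual positives that were detected
    specificity = tn / (tn + fp)   # share of actual negatives that were cleared
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "mcc": mcc,
    }

# Hypothetical counts for an imbalanced problem (1% positives), not taken from the study.
metrics = binary_metrics(tp=80, fp=40, tn=9860, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, so it stays informative when one class is rare.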


Developing and Improving Risk Models using Machine-learning Based Algorithms

arXiv.org Machine Learning

The objective of this study is to develop a good risk model for classifying business delinquency by simultaneously exploring several machine learning based methods including regularization, hyper-parameter optimization, and model ensembling algorithms. The rationale behind the analyses is first to obtain good base binary classifiers (including Logistic Regression ($LR$), K-Nearest Neighbors ($KNN$), Decision Tree ($DT$), and Artificial Neural Networks ($ANN$)) via regularization and appropriate settings of hyper-parameters. Then two model ensembling algorithms, bagging and boosting, are applied to the good base classifiers for further model improvement. The models are evaluated using accuracy, Area Under the Receiver Operating Characteristic Curve (AUC of ROC), recall, and F1 score via 10-fold cross-validation repeated 10 times. The results show that the optimal base classifiers, along with their hyper-parameter settings, are $LR$ without regularization, $KNN$ with 9 nearest neighbors, $DT$ with the maximum level of the tree set to 7, and $ANN$ with three hidden layers. Bagging on $KNN$ with $K = 9$ is the best model we obtained for risk classification, reaching an average accuracy, AUC, recall, and F1 score of 0.90, 0.93, 0.82, and 0.89, respectively.


Regularised Text Logistic Regression: Key Word Detection and Sentiment Classification for Online Reviews

arXiv.org Machine Learning

Online customer reviews have become important for managers and executives in the hospitality and catering industry who wish to obtain a comprehensive understanding of their customers' demands and expectations. We propose a Regularized Text Logistic (RTL) regression model to perform text analytics and sentiment classification on unstructured text data, which automatically identifies a set of statistically significant and operationally insightful word features, and achieves satisfactory predictive classification accuracy. We apply the RTL model to two online review datasets, Restaurant and Hotel, from TripAdvisor. Our results demonstrate satisfactory classification performance compared with alternative classifiers with a highest true positive rate of 94.9%. Moreover, RTL identifies a small set of word features, corresponding to 3% for Restaurant and 20% for Hotel, which boosts working efficiency by allowing managers to drill down into a much smaller set of important customer reviews. We also develop the consistency, sparsity and oracle property of the estimator.


Machine Learning: Some notes about Cross-Validation

#artificialintelligence

K-fold cross-validation is one of the most widely used cross-validation methods. In this method, k is the number of experiments (or folds) run to train and test on the data. For example, suppose that we want to run 5 experiments on a data set of 1000 records. In the first experiment, we test (validate) on the first 200 records and train on the remaining 800. When the first experiment is finished, we obtain a certain accuracy.
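The splitting scheme can be sketched in a few lines of Python. This is a minimal illustration, assuming contiguous, unshuffled folds as in the example of 1000 records and 5 folds; the function name `k_fold_indices` is hypothetical, and in practice the records are usually shuffled before splitting.

```python
def k_fold_indices(n_records, k):
    """Yield (test_indices, train_indices) for each of the k folds."""
    fold_size = n_records // k
    for fold in range(k):
        start = fold * fold_size
        # The last fold absorbs any remainder when n_records is not divisible by k.
        stop = n_records if fold == k - 1 else start + fold_size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_records))
        yield test, train

folds = list(k_fold_indices(1000, 5))
first_test, first_train = folds[0]
print(len(first_test), len(first_train))  # 200 800
```

Each record is used for testing exactly once across the 5 experiments, and the per-fold accuracies are typically averaged to give the final estimate.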


Massachusetts suspends Boston-based coronavirus testing lab Orig3n after nearly 400 false positives

Boston Herald

The state has suspended Boston-based COVID-19 testing lab Orig3n Laboratory after it produced nearly 400 false positive results. Public health officials became aware in early August of an "unusually high positivity rate" among the lab's test results and requested that Orig3n stop testing for the virus as of Aug. 8. Specimens were sent to an independent lab to be retested as part of a state Department of Public Health investigation, and the results showed at least 383 false positives. On Aug. 27, the state Department of Public Health notified Orig3n of "three significant certification deficiencies that put patients at immediate risk of harm," according to a DPH spokeswoman. They included the failure of the lab's director to provide overall management, issues with the extraction phase of testing, and a failure to meet analytic requirements such as documenting the daily sanitizing of equipment used for coronavirus testing. A statement of deficiency was issued on Sept. 4. The lab must now respond with a written plan of correction by Sept. 14, "and if action is not taken it can face sanctions," DPH said.


Highly Accurate CNN Inference Using Approximate Activation Functions over Homomorphic Encryption

arXiv.org Machine Learning

In the big data era, cloud-based machine learning as a service (MLaaS) has attracted considerable attention. However, when handling sensitive data, such as financial and medical data, a privacy issue emerges, because the cloud server can access clients' raw data. A common method of handling sensitive data in the cloud uses homomorphic encryption, which allows computation over encrypted data without decryption. Previous research usually adopted a low-degree polynomial mapping function, such as the square function, for data classification. However, this technique results in low classification accuracy. In this study, we seek to improve the classification accuracy for inference processing in a convolutional neural network (CNN) while using homomorphic encryption. We adopt an activation function that approximates Google's Swish activation function with a fourth-order polynomial. We also adopt batch normalization to normalize the inputs to the Swish function so that they fit the input range over which the approximation error is minimized. We implemented CNN inference labeling over homomorphic encryption using Microsoft's Simple Encrypted Arithmetic Library (SEAL) for the Cheon-Kim-Kim-Song (CKKS) scheme. The experimental evaluations confirmed classification accuracies of 99.22% and 80.48% for MNIST and CIFAR-10, respectively, which correspond to improvements of 0.04% and 4.11%, respectively, over previous methods.
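Schemes such as CKKS evaluate additions and multiplications on ciphertexts, so a non-polynomial activation like Swish, x * sigmoid(x), has to be replaced by a polynomial. The sketch below fits a fourth-order polynomial to Swish by ordinary least squares. The interval [-4, 4], the sampling grid, and the fitting method are assumptions made for illustration; the abstract does not state the authors' coefficients or fitting procedure.

```python
import math

def swish(x):
    """Swish activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def fit_polynomial(xs, ys, degree):
    """Least-squares fit via the normal equations; returns c with p(x) = sum(c[i] * x**i)."""
    n = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y] for the monomial basis.
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]

def poly_eval(coeffs, x):
    """Evaluate the polynomial with Horner's rule (only additions and multiplications)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

# Assumed input range: batch normalization is expected to keep inputs roughly inside it.
xs = [-4.0 + 8.0 * i / 400 for i in range(401)]
coeffs = fit_polynomial(xs, [swish(x) for x in xs], degree=4)
max_err = max(abs(poly_eval(coeffs, x) - swish(x)) for x in xs)
print("coefficients (constant term first):", [round(c, 5) for c in coeffs])
print("max absolute error on [-4, 4]:", round(max_err, 4))
```

The approximation is only good inside the fitted interval and degrades quickly outside it, which is consistent with the abstract's use of batch normalization to keep the activation inputs within a fixed range.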


A Rigorous Machine Learning Analysis Pipeline for Biomedical Binary Classification: Application in Pancreatic Cancer Nested Case-control Studies with Implications for Bias Assessments

arXiv.org Machine Learning

Machine learning (ML) offers a collection of powerful approaches for detecting and modeling associations, often applied to data having a large number of features and/or complex associations. Currently, there are many tools to facilitate implementing custom ML analyses (e.g. scikit-learn). Interest is also increasing in automated ML packages, which can make it easier for non-experts to apply ML and have the potential to improve model performance. ML permeates most subfields of biomedical research with varying levels of rigor and correct usage. Tremendous opportunities offered by ML are frequently offset by the challenge of assembling comprehensive analysis pipelines, and the ease of ML misuse. In this work we have laid out and assembled a complete, rigorous ML analysis pipeline focused on binary classification (i.e. case/control prediction), and applied this pipeline to both simulated and real-world data. At a high level, this 'automated' but customizable pipeline includes a) exploratory analysis, b) data cleaning and transformation, c) feature selection, d) model training with 9 established ML algorithms, each with hyperparameter optimization, and e) thorough evaluation, including appropriate metrics, statistical analyses, and novel visualizations. This pipeline organizes the many subtle complexities of ML pipeline assembly to illustrate best practices to avoid bias and ensure reproducibility. Additionally, this pipeline is the first to compare established ML algorithms to 'ExSTraCS', a rule-based ML algorithm with the unique capability of interpretably modeling heterogeneous patterns of association. While designed to be widely applicable, we apply this pipeline to an epidemiological investigation of established and newly identified risk factors for pancreatic cancer to evaluate how different sources of bias might be handled by ML algorithms.


Quantifying Explainability of Saliency Methods in Deep Neural Networks

arXiv.org Artificial Intelligence

One way to achieve eXplainable artificial intelligence (XAI) is through the use of post-hoc analysis methods. In particular, methods that generate heatmaps have been used to explain black-box models, such as deep neural networks. In some cases, heatmaps are appealing because they are intuitive and visual. However, quantitative analysis that demonstrates the actual potential of heatmaps has been lacking, and comparisons between different methods are not standardized either. In this paper, we introduce a synthetic dataset that can be generated ad hoc along with the ground-truth heatmaps for better quantitative assessment. Each sample is an image of a cell with easily distinguishable features, facilitating a more transparent assessment of different XAI methods. Comparisons and recommendations are made, and shortcomings are clarified along with suggestions for future research directions to handle the finer details of select post-hoc analysis methods.