Accuracy
Improving Fairness and Privacy in Selection Problems
Khalili, Mohammad Mahdi, Zhang, Xueru, Abroshan, Mahed, Sojoudi, Somayeh
Supervised learning models have been increasingly used for making decisions about individuals in applications such as hiring, lending, and college admission. These models may inherit pre-existing biases from training datasets and discriminate on the basis of protected attributes (e.g., race or gender). In addition to unfairness, privacy concerns also arise when the use of models reveals sensitive personal information. Among various privacy notions, differential privacy has become popular in recent years. In this work, we study the possibility of using a differentially private exponential mechanism as a post-processing step to improve both the fairness and the privacy of supervised learning models. Unlike many existing works, we consider a scenario where a supervised model is used to select a limited number of applicants because the number of available positions is limited. This assumption is well-suited to various scenarios, such as job applications and college admission. We use "equal opportunity" as the fairness notion and show that the exponential mechanism can make the decision-making process perfectly fair. Moreover, experiments on real-world datasets show that the exponential mechanism can improve both privacy and fairness, with a slight decrease in accuracy compared to the model without post-processing.
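To make the idea of an exponential mechanism concrete, here is a minimal sketch of the generic textbook form, in which candidate i is chosen with probability proportional to exp(epsilon * score_i / (2 * sensitivity)). It is not the authors' post-processing procedure: the function names, the use of raw model scores as the utility, and the default sensitivity of 1.0 are all assumptions made for illustration, and only a single applicant is selected.

```python
import math
import random


def selection_probabilities(scores, epsilon, sensitivity=1.0):
    """Exponential-mechanism probabilities over candidates.

    p_i is proportional to exp(epsilon * score_i / (2 * sensitivity)).
    The utility (raw score) and sensitivity value are illustrative assumptions.
    """
    scale = epsilon / (2.0 * sensitivity)
    top = max(scores)  # subtract the max score for numerical stability
    weights = [math.exp(scale * (s - top)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def select_one(scores, epsilon, sensitivity=1.0, rng=None):
    """Randomly pick one candidate index using the probabilities above."""
    rng = rng or random.Random()
    probs = selection_probabilities(scores, epsilon, sensitivity)
    return rng.choices(range(len(scores)), weights=probs, k=1)[0]


if __name__ == "__main__":
    applicant_scores = [0.9, 0.5, 0.1]  # hypothetical qualification scores
    print(selection_probabilities(applicant_scores, epsilon=1.0))
    print(select_one(applicant_scores, epsilon=1.0, rng=random.Random(0)))
```

A smaller epsilon pushes the probabilities toward uniform (more privacy, less accuracy), while a larger epsilon concentrates them on the top-scoring candidate.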
AI-enabled Prediction of eSports Player Performance Using the Data from Heterogeneous Sensors
Smerdov, Anton, Burnaev, Evgeny, Somov, Andrey
The rapid growth of eSports has not been matched by tools for high-quality analytics and training in Pro and amateur eSports teams. We report on an Artificial Intelligence (AI) enabled solution for predicting eSports player in-game performance using exclusively data from sensors. To this end, we collected physiological, environmental, and game chair data from Pro and amateur players. Player performance is assessed from the game logs of a multiplayer game for each moment in time using a recurrent neural network. We found that an attention mechanism improves the generalization of the network and also provides straightforward feature importance. The best model achieves a ROC AUC score of 0.73. The model can predict the performance of a particular player even when that player's data are not included in the training set. The proposed solution has a number of promising applications for Pro eSports teams and can serve as a learning tool for amateur players.
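For readers unfamiliar with the ROC AUC figure quoted above, the following sketch computes it from its pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name and the toy labels and scores are illustrative assumptions and have no connection to the eSports data.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 class labels.
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy example: three of the four positive/negative pairs are ordered correctly.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking. This quadratic-time version is meant for clarity rather than speed.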
Learning to Separate Clusters of Adversarial Representations for Robust Adversarial Detection
Joe, Byunggill, Hamm, Jihun, Hwang, Sung Ju, Son, Sooel, Shin, Insik
Although deep neural networks have shown promising performance on various tasks, they are susceptible to incorrect predictions induced by imperceptibly small perturbations in inputs. Many previous works have proposed methods to detect adversarial attacks. Yet most of them cannot effectively detect adaptive whitebox attacks, where the adversary knows both the model and the defense method. In this paper, we propose a new probabilistic adversarial detector motivated by the recently introduced notion of non-robust features. We consider non-robust features a common property of adversarial examples, and we deduce that it is possible to find a cluster in representation space corresponding to this property. This idea leads us to estimate the probability distribution of adversarial representations in a separate cluster and to leverage that distribution for a likelihood-based adversarial detector.
Machine-Learning Arithmetic Curves
He, Yang-Hui, Lee, Kyu-Hwan, Oliver, Thomas
We show that standard machine-learning algorithms may be trained to predict certain invariants of low genus arithmetic curves. Using datasets of size around one hundred thousand, we demonstrate the utility of machine-learning in classification problems pertaining to the BSD invariants of an elliptic curve (including its rank and torsion subgroup), and the analogous invariants of a genus 2 curve. Our results show that a trained machine can efficiently classify curves according to these invariants with high accuracies (>0.97). For problems such as distinguishing between torsion orders, and the recognition of integral points, the accuracies can reach 0.998.
A predictive model for kidney transplant graft survival using machine learning
Pahl, Eric S., Street, W. Nick, Johnson, Hans J., Reed, Alan I.
Kidney transplantation is the best treatment for end-stage renal failure patients. The predominant method used for kidney quality assessment is the Cox regression-based kidney donor risk index. A machine learning method may provide improved prediction of transplant outcomes and help decision-making. A popular tree-based machine learning method, random forest, was trained and evaluated with the same data originally used to develop the risk index (70,242 observations from 1995-2005). The random forest successfully predicted 2,148 more transplants than the risk index at an equal type II error rate of 10%. Predicted results were analyzed against follow-up survival outcomes up to 240 months after transplant using Kaplan-Meier analysis, which confirmed that the random forest performed significantly better than the risk index (p<0.05). The random forest predicted significantly more successful and longer-surviving transplants than the risk index. Random forests and other machine learning models may improve transplant decisions.
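The Kaplan-Meier analysis mentioned above rests on the product-limit estimator: at each time with at least one observed event, the running survival probability is multiplied by (1 - events / number at risk). The sketch below implements that standard estimator on made-up follow-up data. The function name and the toy durations are assumptions for illustration, and nothing here reproduces the study's data or its significance test.

```python
def kaplan_meier(durations, events):
    """Product-limit (Kaplan-Meier) survival estimate.

    durations: follow-up time for each subject (e.g., months after transplant).
    events: 1 if the event (e.g., graft failure) was observed, 0 if censored.
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        # Subjects whose event was observed exactly at time t.
        failures = sum(1 for d, e in zip(durations, events) if d == t and e)
        # Everyone leaving the risk set at t (events and censored alike).
        leaving = sum(1 for d in durations if d == t)
        if failures:
            survival *= 1.0 - failures / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical data: four subjects, one censored at time 2.
    for time, prob in kaplan_meier([1, 2, 2, 3], [1, 1, 0, 1]):
        print(time, round(prob, 3))
```

Censored subjects reduce the number at risk without lowering the survival estimate, which is what lets the method use incomplete follow-up.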
Machine Learning and Credit Risk Modelling
Machine Learning (ML) algorithms leverage large datasets to determine patterns and construct meaningful recommendations. Likewise, credit risk modelling is a field with access to a large amount of diverse data where ML can be deployed to add analytical value. In the following analysis, we explore how various ML techniques can be used for assessing probability of default (PD) and compare their performance in a real-world setting. A recent publication by the Bank of England (BoE) and the Financial Conduct Authority (FCA) reports the results of a survey on the use of ML in United Kingdom (UK) financial services.[1] Results show that two-thirds of respondents use ML in some form.
Brain Co-Processors: Using AI to Restore and Augment Brain Function
Brain-computer interfaces (BCIs) use decoding algorithms to control prosthetic devices based on brain signals for restoration of lost function. Computer-brain interfaces (CBIs), on the other hand, use encoding algorithms to transform external sensory signals into neural stimulation patterns for restoring sensation or providing sensory feedback for closed-loop prosthetic control. In this article, we introduce brain co-processors, devices that combine decoding and encoding in a unified framework using artificial intelligence (AI) to supplement or augment brain function. Brain co-processors can be used for a range of applications, from inducing Hebbian plasticity for rehabilitation after brain injury to reanimating paralyzed limbs and enhancing memory. A key challenge is simultaneous multi-channel neural decoding and encoding for optimization of external behavioral or task-related goals. We describe a new framework for developing brain co-processors based on artificial neural networks, deep learning and reinforcement learning. These "neural co-processors" allow joint optimization of cost functions with the nervous system to achieve desired behaviors. By coupling artificial neural networks with their biological counterparts, neural co-processors offer a new way of restoring and augmenting the brain, as well as a new scientific tool for brain research. We conclude by discussing the potential applications and ethical implications of brain co-processors.
Over a Decade of Social Opinion Mining
The popularity and importance of social media are on the increase, as people use it for various types of social interaction across multiple channels. This social interaction by online users includes submission of feedback, opinions and recommendations about various individuals, entities, topics, and events. This systematic review focuses on the evolving research area of Social Opinion Mining, tasked with the identification of multiple opinion dimensions, such as subjectivity, sentiment polarity, emotion, affect, sarcasm and irony, from user-generated content represented across multiple social media platforms and in various media formats, like text, image, video and audio. Through Social Opinion Mining, natural language can therefore be understood in terms of the different opinion dimensions, as expressed by humans. This contributes towards the evolution of Artificial Intelligence, which in turn helps the advancement of several real-world use cases, such as customer service and decision making. A thorough systematic review was carried out on Social Opinion Mining research, covering 485 studies and spanning a period of twelve years between 2007 and 2018. The in-depth analysis focuses on the social media platforms, techniques, social datasets, language, modality, tools and technologies, natural language processing tasks and other aspects derived from the published studies. Such multi-source information fusion plays a fundamental role in the mining of people's social opinions from social media platforms. These opinions can be utilised in many application areas, such as marketing, advertising and sales for product/service management, and in multiple domains and industries, such as politics, technology, finance, healthcare, sports and government. Future research directions are presented; further research and development has the potential to leave a wider academic and societal impact.
Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification
Yuan, Zhuoning, Yan, Yan, Sonka, Milan, Yang, Tianbao
Deep AUC Maximization (DAM) is a paradigm for learning a deep neural network by maximizing the AUC score of the model on a dataset. Most previous works on AUC maximization focus on the perspective of optimization by designing efficient stochastic algorithms, and studies on the generalization performance of DAM on difficult tasks are missing. In this work, we aim to make DAM more practical for interesting real-world applications (e.g., medical image classification). First, we propose a new margin-based surrogate loss function for the AUC score (named the AUC margin loss). It is more robust than the commonly used AUC square loss, while enjoying the same advantage in terms of large-scale stochastic optimization. Second, we conduct empirical studies of our DAM method on difficult medical image classification tasks, namely classification of chest x-ray images for identifying many threatening diseases and classification of images of skin lesions for identifying melanoma. Our DAM method has achieved great success on these difficult tasks, i.e., 1st place in the Stanford CheXpert competition (by the paper submission date) and a Top 1% rank (rank 33 out of 3314 teams) in the Kaggle 2020 Melanoma classification competition. We also conduct extensive ablation studies to demonstrate the advantages of the new AUC margin loss over the AUC square loss on benchmark datasets. To the best of our knowledge, this is the first work that makes DAM succeed on large-scale medical image datasets.
Amazon takes top three spots in Audio Anomaly Detection Challenge
This week at Amazon Web Services' re:Invent 2020 conference, Amazon announced Amazon Monitron, an end-to-end machine-monitoring system composed of sensors, a gateway, and a machine learning model that detects anomalies in vibration (structure-borne sound) or temperature and predicts when equipment may require maintenance. Machine condition monitoring was also the topic of a challenge at the Workshop on the Detection and Classification of Acoustic Scenes and Events (DCASE 2020), in November, in which Amazon took the top three spots, out of 117 submissions. The challenge was to determine whether the sounds emitted by a machine -- such as a fan, pump, or valve -- were normal or anomalous. Forty academic and industry teams submitted entries, an average of almost three submissions per team. In a pair of papers we presented at the workshop, we describe the two different neural-network-based approaches we took in our submissions to the challenge.