Performance Analysis
Can graph machine learning identify hate speech in online social networks?
Over three decades, the Internet has grown from a small network of computers used by research scientists to communicate and exchange data to a technology that has penetrated almost every aspect of our day-to-day lives. Today, it is hard to imagine a life without online access for doing business, shopping, and socialising. A technology that has connected humanity at a scale never before possible has also amplified some of our worst qualities. Online hate speech spreads virally across the globe with short- and long-term consequences for individuals and societies. These consequences are often difficult to measure and predict. Online social media websites and mobile apps have inadvertently become the platform for the spread and proliferation of hate speech. "Hate speech is a type of speech that takes place online (e.g., the Internet, online social media platforms) with the purpose to attack a person or a group on the basis of attributes such as race, religion, ethnic origin, sexual orientation, disability, or gender."
Machine Learning approach for Credit Scoring
Provenzano, A. R., Trifirò, D., Datteo, A., Giada, L., Jean, N., Riciputi, A., Pera, G. Le, Spadaccino, M., Massaron, L., Nordio, C.
In this work we build a stack of machine learning models aimed at composing a state-of-the-art credit rating and default prediction system, obtaining excellent out-of-sample performance. Our approach is an excursion through the most recent ML / AI concepts, starting from natural language processing (NLP) applied to economic sectors' (textual) descriptions using embedding and autoencoders (AE), going through the classification of defaultable firms on the basis of a wide range of economic features using gradient boosting machines (GBM) and calibrating their probabilities paying due attention to the treatment of unbalanced samples. Finally we assign credit ratings through genetic algorithms (differential evolution, DE). Model interpretability is achieved by implementing recent techniques such as SHAP and LIME, which explain predictions locally in features' space.
The multilayer random dot product graph
Jones, Andrew, Rubin-Delanchy, Patrick
We present an extension of the latent position network model known as the generalised random dot product graph to accommodate multiple graphs with a common node structure, based on a matrix representation of the natural third-order tensor created from the adjacency matrices of these graphs. Theoretical results concerning the asymptotic behaviour of the node representations obtained by spectral embedding are established, showing that after the application of a linear transformation these converge uniformly in the Euclidean norm to the latent positions with a Gaussian error. The flexibility of the model is demonstrated through application to the tasks of latent position recovery and two-graph hypothesis testing, in which it performs favourably compared to existing models. Empirical improvements in link prediction over single graph embeddings are exhibited in a cyber-security example.
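The spectral embedding step described above can be illustrated with a minimal sketch. It assumes that the "matrix representation" of the adjacency tensor is the column-wise unfolding [A_1 | ... | A_k] and that node representations are taken from a rank-d truncated SVD scaled by the square roots of the singular values. The function name `unfolded_spectral_embedding`, the two-block example graphs, and all parameter values are hypothetical and chosen only for illustration; this is not necessarily the authors' exact procedure.

```python
import numpy as np


def unfolded_spectral_embedding(adjacency_matrices, d):
    """Embed several graphs on a shared node set with one truncated SVD.

    Illustrative assumption: the n x n x k tensor is unfolded column-wise
    into an n x (k*n) matrix before the decomposition.
    """
    unfolded = np.hstack(adjacency_matrices)           # [A_1 | ... | A_k]
    u, s, vt = np.linalg.svd(unfolded, full_matrices=False)
    root = np.sqrt(s[:d])
    left = u[:, :d] * root                             # shared n x d representation
    right = vt[:d, :].T * root                         # (k*n) x d, one block per graph
    n = adjacency_matrices[0].shape[0]
    per_graph = [right[i * n:(i + 1) * n] for i in range(len(adjacency_matrices))]
    return left, per_graph


# Hypothetical example: two graphs on the same 40 nodes, each drawn from a
# two-community block model with different connection probabilities.
rng = np.random.default_rng(0)
n = 40
labels = np.repeat([0, 1], n // 2)
block_probs = [
    np.array([[0.6, 0.1], [0.1, 0.5]]),
    np.array([[0.2, 0.4], [0.4, 0.3]]),
]
graphs = []
for b in block_probs:
    p = b[labels][:, labels]                           # n x n edge probabilities
    upper = np.triu(rng.random((n, n)) < p, k=1)       # sample the upper triangle
    graphs.append((upper + upper.T).astype(float))     # symmetrise, no self-loops

shared, per_graph = unfolded_spectral_embedding(graphs, d=2)
```

In this sketch `shared` holds one representation per node that is common to all graphs, while `per_graph[i]` holds the node representations specific to graph `i`; the product `shared @ per_graph[i].T` is a rank-d approximation of the i-th adjacency matrix.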
An Empirical Characterization of Fair Machine Learning For Clinical Risk Prediction
Pfohl, Stephen R., Foryciarz, Agata, Shah, Nigam H.
The use of machine learning to guide clinical decision making has the potential to worsen existing health disparities. Several recent works frame the problem as that of algorithmic fairness, a framework that has attracted considerable attention and criticism. However, the appropriateness of this framework is unclear due to both ethical and technical considerations, the latter of which include trade-offs between measures of fairness and model performance that are not well understood for predictive models of clinical outcomes. To inform the ongoing debate, we conduct an empirical study to characterize the impact of penalizing group fairness violations on an array of measures of model performance and group fairness. We repeat the analyses across multiple observational healthcare databases, clinical outcomes, and sensitive attributes. We find that procedures that penalize differences between the distributions of predictions across groups induce nearly-universal degradation of multiple performance metrics within groups. On examining the secondary impact of these procedures, we observe heterogeneity of the effect of these procedures on measures of fairness in calibration and ranking across experimental conditions. Beyond the reported trade-offs, we emphasize that analyses of algorithmic fairness in healthcare lack the contextual grounding and causal awareness necessary to reason about the mechanisms that lead to health disparities, as well as about the potential of algorithmic fairness methods to counteract those mechanisms. In light of these limitations, we encourage researchers building predictive models for clinical use to step outside the algorithmic fairness frame and engage critically with the broader sociotechnical context surrounding the use of machine learning in healthcare.
Crowd, Lending, Machine, and Bias
Fu, Runshan, Huang, Yan, Singh, Param Vir
Big data and machine learning (ML) algorithms are key drivers of many fintech innovations. While it may be obvious that replacing humans with machines would increase efficiency, it is not clear whether and where machines can make better decisions than humans. We answer this question in the context of crowd lending, where decisions are traditionally made by a crowd of investors. Using data from Prosper.com, we show that a reasonably sophisticated ML algorithm predicts listing default probability more accurately than crowd investors. The dominance of the machine over the crowd is more pronounced for highly risky listings. We then use the machine to make investment decisions, and find that the machine benefits not only the lenders but also the borrowers. When machine prediction is used to select loans, it leads to a higher rate of return for investors and more funding opportunities for borrowers with few alternative funding options. We also find suggestive evidence that the machine is biased in gender and race even when it does not use gender and race information as input. We propose a general and effective "debiasing" method that can be applied to any prediction-focused ML application, and demonstrate its use in our context. We show that the debiased ML algorithm, which suffers from lower prediction accuracy, still leads to better investment decisions compared with the crowd. These results indicate that ML can help crowd lending platforms better fulfill the promise of providing access to financial resources to otherwise underserved individuals and ensure fairness in the allocation of these resources.
On Controllability of AI
The unprecedented progress in Artificial Intelligence (AI) [1-6] over the last decade came alongside multiple AI failures [7, 8] and cases of dual use [9], causing a realization [10] that it is not sufficient to create highly capable machines, but that it is even more important to make sure that intelligent machines are beneficial [11] for humanity. This led to the birth of the new subfield of research commonly known as AI Safety and Security [12], with hundreds of papers and books published annually on different aspects of the problem [13-31]. All such research is done under the assumption that the problem of controlling highly capable intelligent machines is solvable, which has not been established by any rigorous means. However, it is a standard practice in computer science to first show that a problem doesn't belong to a class of unsolvable problems [32, 33] before investing resources into trying to solve it or deciding what approaches to try. Unfortunately, to the best of our knowledge no mathematical proof or even rigorous argumentation has been published demonstrating that the AI control problem may be solvable, even in principle, much less in practice. Or as Gans puts it citing Bostrom: "Thus far, AI researchers and philosophers have not been able to come up with methods of control that would ensure [bad] outcomes did not take place …" [34].
A Distributionally Robust Approach to Fair Classification
Taskesen, Bahar, Nguyen, Viet Anh, Kuhn, Daniel, Blanchet, Jose
We propose a distributionally robust logistic regression model with an unfairness penalty that prevents discrimination with respect to sensitive attributes such as gender or ethnicity. This model is equivalent to a tractable convex optimization problem if a Wasserstein ball centered at the empirical distribution on the training data is used to model distributional uncertainty and if a new convex unfairness measure is used to incentivize equalized opportunities. We demonstrate that the resulting classifier improves fairness at a marginal loss of predictive accuracy on both synthetic and real datasets. We also derive linear programming-based confidence bounds on the level of unfairness of any pre-trained classifier by leveraging techniques from optimal uncertainty quantification over Wasserstein balls.
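To make the notion of equalized opportunities concrete, the sketch below computes the standard equal-opportunity gap, i.e. the largest difference in true positive rate between groups defined by a sensitive attribute. This is the textbook, non-convex quantity and not the new convex unfairness measure proposed in the abstract, whose form is not given here; the function names and the toy labels are assumptions made for illustration.

```python
def true_positive_rate(y_true, y_pred):
    """Fraction of truly positive examples that were predicted positive."""
    predictions_on_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not predictions_on_positives:
        raise ValueError("no positive examples in this group")
    return sum(predictions_on_positives) / len(predictions_on_positives)


def equal_opportunity_gap(y_true, y_pred, group):
    """Largest difference in true positive rate between any two groups."""
    rates = {}
    for g in set(group):
        idx = [i for i, value in enumerate(group) if value == g]
        rates[g] = true_positive_rate(
            [y_true[i] for i in idx],
            [y_pred[i] for i in idx],
        )
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: binary labels, binary predictions, two groups.
y_true = [1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1]
group = ["a", "a", "a", "b", "b", "b"]

gap = equal_opportunity_gap(y_true, y_pred, group)
print(f"equal-opportunity gap: {gap:.2f}")
```

A gap of zero means every group has the same true positive rate; a fairness-penalised classifier of the kind described above trades some predictive accuracy for a smaller value of a measure in this spirit.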
Can we Estimate Truck Accident Risk from Telemetric Data using Machine Learning?
Hébert, Antoine, Marineau, Ian, Gervais, Gilles, Glatard, Tristan, Jaumard, Brigitte
Road accidents have a high societal cost that could be reduced through improved risk predictions using machine learning. This study investigates whether telemetric data collected on long-distance trucks can be used to predict the risk of accidents associated with a driver. We use a dataset provided by a truck transportation company containing the driving data of 1,141 drivers for 18 months. We evaluate two different machine learning approaches to perform this task. In the first approach, features are extracted from the time series data using the FRESH algorithm and then used to estimate the risk using Random Forests. In the second approach, we use a convolutional neural network to directly estimate the risk from the time-series data. We find that neither approach is able to successfully estimate the risk of accidents on this dataset, in spite of many methodological attempts. We discuss the difficulties of using telemetric data for the estimation of the risk of accidents that could explain this negative result.
Dealing with Nuisance Parameters using Machine Learning in High Energy Physics: a Review
Dorigo, Tommaso, de Castro, Pablo
Of these, probably the most common is the use of supervised classification to construct low-dimensional event summaries, which are informative for statistical inference on a given set of parameters of interest. The learned summary statistics (functions of the data that are informative on their relevant properties) can efficiently combine high-dimensional information from each event into one or a few variables which can be used as the basis of statistical inference. The informational source for this compression is simulated observations produced by a complex generative model; the latter reproduces the chain of physical processes occurring in subatomic collisions and the subsequent interaction of the produced final state particles with the detection elements.
Multi-Classifier selection-fusion framework: application to NDT of complex metallic parts
Yaghoubi, Vahid, Cheng, Liangliang, Van Paepegem, Wim, Kersemans, Mathias
Recent advances in computational methods, material science, and manufacturing technologies reveal promising potential for using geometrically complex parts to optimize the performance of structural systems. However, this potential has not yet been realized, partly due to the immaturity of nondestructive testing (NDT) of such complex parts. Process compensated resonance testing (PCRT) is one of the methods that are in the focus of researchers for this purpose. The key to success for the PCRT approach is to use high-frequency vibration data in conjunction with statistical pattern recognition methods for supervised classification of parts in terms of their structural quality. In this paper, a multi-classifier selection-fusion framework based on the Dempster-Shafer theory is proposed. Two new weighting approaches are introduced to enhance the fusion performance, and as such the classification performance. The effectiveness of the proposed framework is validated by its application to six UCI machine learning datasets and one experimental dataset collected from polycrystalline Nickel alloy first-stage turbine blades with a variety of damage features. Comparison with four state-of-the-art fusion techniques shows the good performance of the introduced classifier selection-fusion framework.
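As background for the fusion step, the following sketch implements Dempster's rule of combination, the basic operation of Dempster-Shafer theory, for two classifier outputs expressed as mass functions. It does not include the classifier selection or the two weighting approaches introduced in the paper; the function name `dempster_combine`, the class labels, and the mass values are hypothetical.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of class labels to
    the belief mass assigned to that set; the masses should sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the common subset.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalise so the remaining masses sum to 1.
    return {s: m / (1.0 - conflict) for s, m in combined.items()}


# Hypothetical outputs of two classifiers judging a part as "ok" or "defect".
OK = frozenset({"ok"})
DEFECT = frozenset({"defect"})
EITHER = OK | DEFECT          # mass left uncommitted between the two classes

classifier_1 = {OK: 0.6, DEFECT: 0.1, EITHER: 0.3}
classifier_2 = {OK: 0.5, DEFECT: 0.3, EITHER: 0.2}

fused = dempster_combine(classifier_1, classifier_2)
for subset, mass in sorted(fused.items(), key=lambda item: -item[1]):
    print(sorted(subset), round(mass, 4))
```

Because the rule is associative and commutative, more than two classifiers can be fused by applying `dempster_combine` repeatedly; a weighting scheme would typically discount each mass function before this step.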