Performance Analysis
Spy agencies have big hopes for AI
WHEN IT COMES to artificial intelligence (AI), spy agencies have been at it longer than most. In the cold war, America's National Security Agency (NSA) and Britain's Government Communications Headquarters (GCHQ) explored early AI to help transcribe and translate the enormous volumes of Soviet phone-intercepts they began hoovering up in the 1960s. Yet the technology was immature. One former European intelligence officer says his service did not use automatic transcription or translation in Afghanistan in the 2000s, relying on native speakers instead. Now the spooks are hoping to do better. The trends that have made AI attractive for business--more data, better algorithms, and more processing power to make it all hum--are giving spy agencies big ideas, too.
Popular Machine Learning Interview Questions, part 2 - KDnuggets
This article is part 2 of my Popular Machine Learning Interview Questions series. Here I feature more questions that I often see asked during interviews. Note that this is neither an interview prep guide nor an exhaustive list of questions. Rather, you should use this article as a refresher for your Machine Learning knowledge. I suggest reading each question and trying to answer it yourself before reading the answer.
Probabilistic combination of eigenlungs-based classifiers for COVID-19 diagnosis in chest CT images
Arco, Juan E., Ortiz, Andrés, Ramírez, Javier, Martínez-Murcia, Francisco J., Zhang, Yu-Dong, Broncano, Jordi, Berbís, M. Álvaro, Royuela-del-Val, Javier, Luna, Antonio, Górriz, Juan M.
The outbreak of the COVID-19 (Coronavirus disease 2019) pandemic has changed the world. According to the World Health Organization (WHO), there have been more than 100 million confirmed cases of COVID-19, including more than 2.4 million deaths. Early detection of the disease is extremely important, and medical imaging such as chest X-ray (CXR) and chest Computed Tomography (CCT) has proved to be an excellent solution. However, reading these images is a manual and time-consuming task for clinicians, which is not ideal when trying to speed up the diagnosis. In this work, we propose an ensemble classifier based on probabilistic Support Vector Machines (SVM) in order to identify pneumonia patterns while providing information about the reliability of the classification. Specifically, each CCT scan is divided into cubic patches, and the features contained in each patch are extracted by applying kernel PCA. The use of base classifiers within an ensemble allows our system to identify the pneumonia patterns regardless of their size or location. The decisions for the individual patches are then combined into a global one according to the reliability of each individual classification: the lower the uncertainty, the higher the contribution. Performance is evaluated in a real scenario, yielding an accuracy of 97.86%. The high performance obtained and the simplicity of the system (use of deep learning in CCT images would result in a huge computational cost) evidence the applicability of our proposal in a real-world environment.
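The combination step described in the abstract (lower uncertainty, higher contribution) can be illustrated with a small sketch. The weighting rule here is an assumption chosen for illustration, not the authors' formula: each patch is weighted by one minus the binary entropy of its predicted probability, and the function names `binary_entropy` and `combine_patches` are hypothetical.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a Bernoulli(p) prediction (0 when p is 0 or 1)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def combine_patches(patch_probs):
    """Reliability-weighted average of per-patch positive-class probabilities.

    Assumption for illustration: weight = 1 - entropy, so a confident patch
    (probability near 0 or 1) counts more than an uncertain one (near 0.5).
    """
    if not patch_probs:
        raise ValueError("need at least one patch probability")
    weights = [1.0 - binary_entropy(p) for p in patch_probs]
    total = sum(weights)
    if total == 0.0:
        # Every patch is maximally uncertain: fall back to the plain mean.
        return sum(patch_probs) / len(patch_probs)
    return sum(w * p for w, p in zip(weights, patch_probs)) / total


# Two uncertain patches barely move the decision of one confident patch.
global_prob = combine_patches([0.95, 0.5, 0.5])
```

In this toy call the two patches at 0.5 receive zero weight, so the global probability is driven entirely by the confident patch. Any other monotone mapping from uncertainty to weight would fit the same description.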
Calibrated Simplex Mapping Classification
Heese, Raoul, Walczak, Michał, Bortz, Michael, Schmid, Jochen
In many supervised learning applications, it is not sufficient to know the most probable class y for a certain data point x. Instead, a well-calibrated probabilistic prediction p(y | x) is required. For instance, in clinical applications, class probabilities are important for confidence in model predictions (Challis et al., 2015). Some classifiers intrinsically provide such a posterior probability, e.g., logistic regression or Gaussian process classification (GPC) as described in Rasmussen and Williams (2006). There are also various methods to install or improve such a calibration for a given classification approach (Niculescu-Mizil and Caruana, 2005), like Platt scaling (Platt, 2000) or isotonic regression (Zadrozny and Elkan, 2002).
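As a rough illustration of the isotonic regression idea mentioned above (not of the calibrated simplex mapping method itself), the following sketch fits a non-decreasing map from raw binary classifier scores to calibrated probabilities using the pool-adjacent-violators procedure. The function name `isotonic_fit` and the toy data are assumptions made for this example.

```python
def isotonic_fit(scores, labels):
    """Pool-adjacent-violators fit for binary calibration.

    Returns the scores in sorted order together with fitted values that are
    non-decreasing in the score and equal to the mean label of each pooled block.
    """
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block is [sum of labels, number of points]
    for _, y in pairs:
        blocks.append([float(y), 1])
        # Merge backwards while the previous block's mean exceeds the current one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            label_sum, count = blocks.pop()
            blocks[-1][0] += label_sum
            blocks[-1][1] += count
    fitted = []
    for label_sum, count in blocks:
        fitted.extend([label_sum / count] * count)
    return [s for s, _ in pairs], fitted


sorted_scores, calibrated = isotonic_fit([0.1, 0.2, 0.3, 0.4], [0, 1, 0, 1])
print(calibrated)  # [0.0, 0.5, 0.5, 1.0]
```

The two middle points violate monotonicity (label 1 at score 0.2, label 0 at score 0.3), so they are pooled into a single block with probability 0.5.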
Fairness in Credit Scoring: Assessment, Implementation and Profit Implications
Kozodoi, Nikita, Jacob, Johannes, Lessmann, Stefan
The rise of algorithmic decision-making has spawned much research on fair machine learning (ML). Financial institutions use ML for building risk scorecards that support a range of credit-related decisions. Yet, the literature on fair ML in credit scoring is scarce. The paper makes two contributions. First, we provide a systematic overview of algorithmic options for incorporating fairness goals in the ML model development pipeline. In this scope, we also consolidate the space of statistical fairness criteria and examine their adequacy for credit scoring. Second, we perform an empirical study of different fairness processors in a profit-oriented credit scoring setup using seven real-world data sets. The empirical results substantiate the evaluation of fairness measures, identify more and less suitable options to implement fair credit scoring, and clarify the profit-fairness trade-off in lending decisions. Specifically, we find that multiple fairness criteria can be approximately satisfied at once and identify separation as a proper criterion for measuring the fairness of a scorecard. We also find fair in-processors to deliver a good balance between profit and fairness. More generally, we show that algorithmic discrimination can be reduced to a reasonable level at a relatively low cost.
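To make the separation criterion concrete: under its usual definition, separation requires the true positive rate and the false positive rate to be equal across groups. The sketch below measures the gaps between two groups on toy data. It is a simplified illustration under that definition, not the evaluation protocol of the paper, and the names `group_rates` and `separation_gaps` are hypothetical.

```python
def group_rates(y_true, y_pred):
    """Return (TPR, FPR) for binary labels and binary decisions."""
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("each group needs at least one positive and one negative")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / positives, fp / negatives


def separation_gaps(y_true, y_pred, groups):
    """Absolute TPR and FPR differences between exactly two groups."""
    by_group = {}
    for t, p, g in zip(y_true, y_pred, groups):
        truths, preds = by_group.setdefault(g, ([], []))
        truths.append(t)
        preds.append(p)
    if len(by_group) != 2:
        raise ValueError("this sketch handles exactly two groups")
    (true_a, pred_a), (true_b, pred_b) = by_group.values()
    tpr_a, fpr_a = group_rates(true_a, pred_a)
    tpr_b, fpr_b = group_rates(true_b, pred_b)
    return abs(tpr_a - tpr_b), abs(fpr_a - fpr_b)


# Toy data: group "a" is accepted less often than group "b" at equal outcomes.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
tpr_gap, fpr_gap = separation_gaps(y_true, y_pred, groups)
```

Both gaps being zero would mean the scorecard satisfies separation on this sample; larger gaps indicate that error rates differ by group.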
A Comparative Evaluation of Quantification Methods
Schumacher, Tobias, Strohmaier, Markus, Lemmerich, Florian
Quantification represents the problem of predicting class distributions in a given target set. It also represents a growing research field in supervised machine learning, for which a large variety of different algorithms has been proposed in recent years. However, a comprehensive empirical comparison of quantification methods that supports algorithm selection is not available yet. In this work, we close this research gap by conducting a thorough empirical performance comparison of 24 different quantification methods. To consider a broad range of different scenarios for binary as well as multiclass quantification settings, we carried out almost 3 million experimental runs on 40 data sets. We observe that no single algorithm generally outperforms all competitors, but identify a group of methods including the Median Sweep and the DyS framework that perform significantly better in binary settings. For the multiclass setting, we observe that a different, broad group of algorithms yields good performance, including the Generalized Probabilistic Adjusted Count, the readme method, the energy distance minimization method, the EM algorithm for quantification, and Friedman's method. More generally, we find that the performance on multiclass quantification is inferior to the results obtained in the binary setting. Our results can guide practitioners who intend to apply quantification algorithms and help researchers to identify opportunities for future research.
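For readers unfamiliar with quantification, the sketch below shows two simple baselines for the binary case: Classify and Count, which reports the fraction of items predicted positive, and Adjusted Count, which corrects that fraction using the classifier's true and false positive rates. It is meant only to illustrate the task. The function names are hypothetical, and the rates are assumed to have been estimated beforehand on held-out labeled data.

```python
def classify_and_count(predictions):
    """Estimate positive-class prevalence as the fraction of positive predictions."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return sum(predictions) / len(predictions)


def adjusted_count(predictions, tpr, fpr):
    """Correct the Classify and Count estimate for known classifier error rates.

    Since observed_rate = tpr * p + fpr * (1 - p), solving for the prevalence p
    gives p = (observed_rate - fpr) / (tpr - fpr). The result is clipped to [0, 1].
    """
    if tpr <= fpr:
        raise ValueError("the classifier must have tpr > fpr")
    raw = (classify_and_count(predictions) - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, raw))


# 45 of 100 items are predicted positive by a classifier with TPR 0.8 and FPR 0.1.
predictions = [1] * 45 + [0] * 55
naive_estimate = classify_and_count(predictions)
corrected_estimate = adjusted_count(predictions, tpr=0.8, fpr=0.1)
```

Here the naive estimate is 0.45, while the corrected estimate is approximately 0.5, the prevalence that is consistent with the stated error rates.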
Bad and good errors: value-weighted skill scores in deep ensemble learning
Guastavino, Sabrina, Piana, Michele, Benvenuto, Federico
In this paper we propose a novel approach to realize forecast verification. Specifically, we introduce a strategy for assessing the severity of forecast errors based on the evidence that, on the one hand, a false alarm just anticipating an occurring event is better than one in the middle of consecutive non-occurring events, and that, on the other hand, a miss of an isolated event has a worse impact than a miss of a single event that is part of several consecutive occurrences. Relying on this idea, we introduce a novel definition of confusion matrix and skill scores giving greater importance to the value of the prediction rather than to its quality. Then, we introduce a deep ensemble learning procedure for binary classification, in which the probabilistic outcomes of a neural network are clustered via optimization of these value-weighted skill scores. We finally show the performance of this approach in three applications concerned with pollution, space weather and stock price forecasting.
Label-Imbalanced and Group-Sensitive Classification under Overparameterization
Kini, Ganesh Ramachandra, Paraskevas, Orestis, Oymak, Samet, Thrampoulidis, Christos
Label-imbalanced and group-sensitive classification seeks to appropriately modify standard training algorithms to optimize relevant metrics such as balanced error and/or equal opportunity. For label imbalances, recent works have proposed a logit-adjusted loss modification to standard empirical risk minimization. We show that this might be ineffective in general and, in particular so, in the overparameterized regime where training continues in the zero training-error regime. Specifically for binary linear classification of a separable dataset, we show that the modified loss converges to the max-margin SVM classifier despite the logit adjustment. Instead, we propose a more general vector-scaling loss that directly relates to the cost-sensitive SVM (CS-SVM), thus favoring larger margin to the minority class. Through an insightful sharp asymptotic analysis for a Gaussian-mixtures data model, we demonstrate the efficacy of CS-SVM in balancing the errors of the minority/majority classes. Our analysis also leads to a simple strategy for optimally tuning the involved margin-ratio parameter. Then, we show how our results extend naturally to binary classification with sensitive groups, thus treating the two common types of imbalances (label/group) in a unifying way. We corroborate our theoretical findings with numerical experiments on both synthetic and real-world datasets.
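The logit-adjusted loss that the abstract argues against can be sketched as follows. This uses the additive form common in the logit-adjustment literature, where tau times the log class prior is added to each logit before the softmax cross-entropy. That form, the parameter name `tau`, and the function name are assumptions for illustration. The abstract does not give the exact formulation, and the vector-scaling loss proposed in the paper is not shown.

```python
import math


def logit_adjusted_loss(logits, label, priors, tau=1.0):
    """Softmax cross-entropy on logits shifted by tau * log(class prior).

    Assumed additive form: adjusted_k = logit_k + tau * log(prior_k).
    With uniform priors this reduces to the standard cross-entropy.
    """
    adjusted = [z + tau * math.log(p) for z, p in zip(logits, priors)]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(adjusted)
    log_norm = top + math.log(sum(math.exp(a - top) for a in adjusted))
    return log_norm - adjusted[label]


# With equal logits, a minority-class example (prior 0.1) incurs a larger loss
# than a majority-class example (prior 0.9).
priors = [0.9, 0.1]
loss_majority = logit_adjusted_loss([0.0, 0.0], label=0, priors=priors)
loss_minority = logit_adjusted_loss([0.0, 0.0], label=1, priors=priors)
```

The larger loss on the minority class is what is intended to push the decision boundary away from it. According to the abstract, this effect can vanish in the overparameterized regime, where training continues after zero training error is reached.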
Sequential Place Learning: Heuristic-Free High-Performance Long-Term Place Recognition
Chancán, Marvin, Milford, Michael
Sequential matching using hand-crafted heuristics has been standard practice in route-based place recognition for enhancing pairwise similarity results for nearly a decade. However, precision-recall performance of these algorithms dramatically degrades when searching on short temporal window (TW) lengths, while demanding high compute and storage costs on large robotic datasets for autonomous navigation research. Here, influenced by biological systems that robustly navigate spacetime scales even without vision, we develop a joint visual and positional representation learning technique, via a sequential process, and design a learning-based CNN+LSTM architecture, trainable via backpropagation through time, for viewpoint- and appearance-invariant place recognition. Our approach, Sequential Place Learning (SPL), is based on a CNN function that visually encodes an environment from a single traversal, thus reducing storage capacity, while an LSTM temporally fuses each visual embedding with corresponding positional data -- obtained from any source of motion estimation -- for direct sequential inference. Contrary to classical two-stage pipelines, e.g., match-then-temporally-filter, our network directly eliminates false-positive rates while jointly learning sequence matching from a single monocular image sequence, even using short TWs. Hence, we demonstrate that our model outperforms 15 classical methods while setting new state-of-the-art performance standards on 4 challenging benchmark datasets, where one of them can be considered solved with recall rates of 100% at 100% precision, correctly matching all places under extreme sunlight-darkness changes. In addition, we show that SPL can be up to 70x faster to deploy than classical methods on a 729 km route comprising 35,768 consecutive frames. Extensive experiments demonstrate the... Baseline code available at https://github.com/mchancan/deepseqslam