precision and recall
Precision and Recall for Time Series
Classical anomaly detection is principally concerned with point-based anomalies, those anomalies that occur at a single point in time. Yet, many real-world anomalies are range-based, meaning they occur over a period of time. Motivated by this observation, we present a new mathematical model to evaluate the accuracy of time series classification algorithms. Our model expands the well-known Precision and Recall metrics to measure ranges, while simultaneously enabling customization support for domain-specific preferences.
Precision-Recall Divergence Optimization for Generative Modeling with GANs and Normalizing Flows
Achieving a balance between image quality (precision) and diversity (recall) is a significant challenge in the domain of generative models. Current state-of-the-art models primarily rely on optimizing heuristics, such as the Fr\'echet Inception Distance. While recent developments have introduced principled methods for evaluating precision and recall, they have yet to be successfully integrated into the training of generative models. Our main contribution is a novel training method for generative models, such as Generative Adversarial Networks and Normalizing Flows, which explicitly optimizes a user-defined trade-off between precision and recall. More precisely, we show that achieving a specified precision-recall trade-off corresponds to minimizing a unique $f$-divergence from a family we call the \mbox{\em PR-divergences}. Conversely, any $f$-divergence can be written as a linear combination of PR-divergences and corresponds to a weighted precision-recall trade-off. Through comprehensive evaluations, we show that our approach improves the performance of existing state-of-the-art models like BigGAN in terms of either precision or recall when tested on datasets such as ImageNet.
TopP&R: Robust Support Estimation Approach for Evaluating Fidelity and Diversity in Generative Models
We propose a robust and reliable evaluation metric for generative models called Topological Precision and Recall (TopP&R, pronounced "topper"), which systematically estimates supports by retaining only topologically and statistically significant features with a certain level of confidence. Existing metrics, such as Inception Score (IS), Frechet Inception Distance (FID), and various Precision and Recall (P&R) variants, rely heavily on support estimates derived from sample features. However, the reliability of these estimates has been overlooked, even though the quality of the evaluation hinges entirely on their accuracy. In this paper, we demonstrate that current methods not only fail to accurately assess sample quality when support estimation is unreliable, but also yield inconsistent results. In contrast, TopP&R reliably evaluates the sample quality and ensures statistical consistency in its results. Our theoretical and experimental findings reveal that TopP&R provides a robust evaluation, accurately capturing the true trend of change in samples, even in the presence of outliers and non-independent and identically distributed (Non-IID) perturbations where other methods result in inaccurate support estimations. To our knowledge, TopP&R is the first evaluation metric specifically focused on the robust estimation of supports, offering statistical consistency under noise conditions.
Quantum Machine Learning in Healthcare: Evaluating QNN and QSVM Models
Tudisco, Antonio, Volpe, Deborah, Turvani, Giovanna
Effective and accurate diagnosis of diseases such as cancer, diabetes, and heart failure is crucial for timely medical intervention and improving patient survival rates. Machine learning has revolutionized diagnostic methods in recent years by developing classification models that detect diseases based on selected features. However, these classification tasks are often highly imbalanced, limiting the performance of classical models. Quantum models offer a promising alternative, exploiting their ability to express complex patterns by operating in a higher-dimensional computational space through superposition and entanglement. These unique properties make quantum models potentially more effective in addressing the challenges of imbalanced datasets. This work evaluates the potential of quantum classifiers in healthcare, focusing on Quantum Neural Networks (QNNs) and Quantum Support Vector Machines (QSVMs), comparing them with popular classical models. The study is based on three well-known healthcare datasets -- Prostate Cancer, Heart Failure, and Diabetes. The results indicate that QSVMs outperform QNNs across all datasets due to their susceptibility to overfitting. Furthermore, quantum models prove the ability to overcome classical models in scenarios with high dataset imbalance. Although preliminary, these findings highlight the potential of quantum models in healthcare classification tasks and lead the way for further research in this domain.
- Health & Medicine > Therapeutic Area > Oncology (0.91)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.58)
Multi-VQC: A Novel QML Approach for Enhancing Healthcare Classification
Tudisco, Antonio, Volpe, Deborah, Turvani, Giovanna
--Accurate and reliable diagnosis of diseases is crucial in enabling timely medical treatment and enhancing patient survival rates. In recent years, Machine Learning has revolutionized diagnostic practices by creating classification models capable of identifying diseases. However, these classification problems often suffer from significant class imbalances, which can inhibit the effectiveness of traditional models. Therefore, the interest in Quantum models has arisen, driven by the captivating promise of overcoming the limitations of the classical counterpart thanks to their ability to express complex patterns by mapping data in a higher-dimensional computational space. This work proposes a novel approach for enhancing the classification performance of Quantum Neural Networks (QNN) consisting of multiple V ariational Quantum Circuits (VQCs) arranged sequentially. This strategy increases the nonlinearity of the model by exploiting the measurement operation and improving its ability to capture complex patterns.
What Is the Optimal Ranking Score Between Precision and Recall? We Can Always Find It and It Is Rarely $F_1$
Piérard, Sébastien, Deliège, Adrien, Van Droogenbroeck, Marc
Ranking methods or models based on their performance is of prime importance but is tricky because performance is fundamentally multidimensional. In the case of classification, precision and recall are scores with probabilistic interpretations that are both important to consider and complementary. The rankings induced by these two scores are often in partial contradiction. In practice, therefore, it is extremely useful to establish a compromise between the two views to obtain a single, global ranking. Over the last fifty years or so,it has been proposed to take a weighted harmonic mean, known as the F-score, F-measure, or $F_β$. Generally speaking, by averaging basic scores, we obtain a score that is intermediate in terms of values. However, there is no guarantee that these scores lead to meaningful rankings and no guarantee that the rankings are good tradeoffs between these base scores. Given the ubiquity of $F_β$ scores in the literature, some clarification is in order. Concretely: (1) We establish that $F_β$-induced rankings are meaningful and define a shortest path between precision- and recall-induced rankings. (2) We frame the problem of finding a tradeoff between two scores as an optimization problem expressed with Kendall rank correlations. We show that $F_1$ and its skew-insensitive version are far from being optimal in that regard. (3) We provide theoretical tools and a closed-form expression to find the optimal value for $β$ for any distribution or set of performances, and we illustrate their use on six case studies.
- Europe > Belgium > Wallonia > Liège Province > Liège (0.40)
- North America > United States > Tennessee > Davidson County > Nashville (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- (4 more...)
Predicting Talent Breakout Rate using Twitter and TV data
Batsaikhan, Bilguun, Fukuda, Hiroyuki
Early detection of rising talents is of paramount importance in the field of advertising. In this paper, we define a concept of talent breakout and propose a method to detect Japanese talents before their rise to stardom. The main focus of the study is to determine the effectiveness of combining Twitter and TV data on predicting time-dependent changes in social data. Although traditional time-series models are known to be robust in many applications, the success of neural network models in various fields (e.g.\ Natural Language Processing, Computer Vision, Reinforcement Learning) continues to spark an interest in the time-series community to apply new techniques in practice. Therefore, in order to find the best modeling approach, we have experimented with traditional, neural network and ensemble learning methods. We observe that ensemble learning methods outperform traditional and neural network models based on standard regression metrics. However, by utilizing the concept of talent breakout, we are able to assess the true forecasting ability of the models, where neural networks outperform traditional and ensemble learning methods in terms of precision and recall.
- Asia > Japan (0.05)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Precision and Recall for Time Series
Classical anomaly detection is principally concerned with point-based anomalies, those anomalies that occur at a single point in time. Yet, many real-world anomalies are range-based, meaning they occur over a period of time. Motivated by this observation, we present a new mathematical model to evaluate the accuracy of time series classification algorithms. Our model expands the well-known Precision and Recall metrics to measure ranges, while simultaneously enabling customization support for domain-specific preferences.