AITopics | aucpr

aucpr

Supplementary Material for Anomaly Detection Benchmark

Neural Information Processing Systems, Aug-19-2025, 02:21:54 GMT

We implement several representative supervised classification algorithms in ADBench (as shown in Appx. B.1), and refer interested readers to recent machine learning books [...]. To this end, some recent studies investigate efficiently using partially labeled data to improve detection performance, and leverage the unlabeled data to facilitate representation learning. As we show in Table 1, there is a line of existing AD benchmarks. A GAN-based method that defines the reconstruction error of the input instance as the anomaly score. The hidden size of REPEN is set to 20, and the margin of triplet loss is set to 1000.

algorithm, anomaly, dataset, (14 more...)

Genre: Research Report (0.65)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Exploring the Impact of Outlier Variability on Anomaly Detection Evaluation Metrics

Ok, Minjae, Klüttermann, Simon, Müller, Emmanuel

arXiv.org Artificial Intelligence, Sep-24-2024

Anomaly detection is a dynamic field, in which the evaluation of models plays a critical role in understanding their effectiveness. The selection and interpretation of the evaluation metrics are pivotal, particularly in scenarios with varying amounts of anomalies. This study focuses on examining the behaviors of three widely used anomaly detection metrics under different conditions: F1 score, Area Under the Receiver Operating Characteristic Curve (ROC AUC), and Area Under the Precision-Recall Curve (AUCPR). Our study critically analyzes the extent to which these metrics provide reliable and distinct insights into model performance, especially considering varying levels of outlier fractions and contamination thresholds in datasets. Through a comprehensive experimental setup involving widely recognized algorithms for anomaly detection, we present findings that challenge the conventional understanding of these metrics and reveal nuanced behaviors under varying conditions. We demonstrate that while the F1 score and AUCPR are sensitive to outlier fractions, the ROC AUC maintains consistency and is unaffected by such variability. Additionally, under conditions of a fixed outlier fraction in the test set, we observe an alignment between ROC AUC and AUCPR, indicating that the choice between these two metrics may be less critical in such scenarios. The results of our study contribute to a more refined understanding of metric selection and interpretation in anomaly detection, offering valuable insights for both researchers and practitioners in the field.
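The sensitivity described in this abstract can be illustrated with a small sketch. It is not the authors' experimental setup: the scores below are hypothetical, the helper names (`roc_auc`, `average_precision`, `evaluate`) are invented for illustration, and AUCPR is approximated by step-wise average precision. Replicating the normal points lowers the outlier fraction without changing how anomalies rank against normals, so ROC AUC stays fixed while the AUCPR estimate drops.

```python
def roc_auc(scores, labels):
    """Fraction of (anomaly, normal) pairs ranked correctly; ties count as 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Step-wise AUCPR estimate; assumes no anomaly shares a score with a normal point."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    tp, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            total += tp / rank  # precision at the rank of each anomaly
    return total / tp


def evaluate(normal_copies):
    """Hypothetical scores; replicating the normal points lowers the outlier fraction."""
    anomaly_scores = [0.9, 0.7, 0.4]
    normal_scores = [0.8, 0.5, 0.3, 0.2, 0.1] * normal_copies
    scores = anomaly_scores + normal_scores
    labels = [1] * len(anomaly_scores) + [0] * len(normal_scores)
    fraction = len(anomaly_scores) / len(scores)
    return fraction, roc_auc(scores, labels), average_precision(scores, labels)


for copies in (1, 10):
    fraction, auc, ap = evaluate(copies)
    print(f"outlier fraction={fraction:.3f}  ROC AUC={auc:.3f}  AUCPR={ap:.3f}")
```

Exact replication is an idealized way to vary the outlier fraction; with random subsampling, ROC AUC would be stable only in expectation.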

anomaly detection, outlier fraction, roc auc, (11 more...)

arXiv.org Artificial Intelligence

2409.15986

Country:

Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Dortmund (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report > New Finding (0.94)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

MercurialMonkey/Harvard-University-Capstone-Project-Data-Science

#artificialintelligence, Jun-22-2019, 18:22:59 GMT

I have submitted my own project using a dataset of my choosing. My project has been reviewed both by my peers and the professor. I chose to work with Credit Card Fraud Detection. It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. The dataset contains transactions made by credit cards in September 2013 by European cardholders. Because the data are highly imbalanced, many observations could be predicted as false negatives, that is, fraudulent transactions classified as legal transactions.
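The false-negative problem mentioned above can be made concrete with a short sketch. The counts are hypothetical and are not taken from the project's dataset, and `confusion_counts` is an illustrative helper rather than code from the repository. A trivial model that labels every transaction as legal reaches very high accuracy while catching no fraud at all, which is one reason accuracy alone is a poor guide on imbalanced data.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) with 1 = fraudulent and 0 = legal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    return tp, fp, fn, tn


# Hypothetical data, not the project's dataset: 20 frauds among 10,000 transactions.
y_true = [1] * 20 + [0] * 9980
y_pred = [0] * 10000  # a trivial model that calls every transaction legal

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn)  # share of frauds that were actually caught
print(f"accuracy={accuracy:.4f}  recall={recall:.4f}  false negatives={fn}")
```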

artificial intelligence, aucpr, machine learning, (1 more...)

Industry: Banking & Finance > Credit (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.42)

Prediction of Workplace Injuries

Sadeqi, Mehdi, Asgarian, Azin, Sibilia, Ariel

arXiv.org Machine Learning, Jun-4-2019

Workplace injuries result in substantial human and financial losses. As reported by the International Labour Organization (ILO), there are more than 374 million work-related injuries reported every year. In this study, we investigate the problem of injury risk prediction and prevention in a work environment. While injuries represent a significant number across all organizations, they are rare events within a single organization. Hence, collecting a sufficiently large dataset from a single organization is extremely difficult. In addition, the collected datasets are often highly imbalanced, which makes the problem more difficult. Finally, risk predictions need to provide additional context for injuries to be prevented. We propose and evaluate the following for a complete solution: 1) several ensemble-based resampling methods to address the class imbalance issues, 2) a novel transfer learning approach to transfer the knowledge across organizations, and 3) various techniques to uncover the association and causal effect of different variables on injury risk, while controlling for relevant confounding factors.

artificial intelligence, injury, machine learning, (18 more...)

arXiv.org Machine Learning

1906.0308

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > United States (0.04)
Asia > India (0.04)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Anomaly Detection with Isolation Forests using H2O - Open Source Leader in AI and ML

#artificialintelligence, Nov-28-2018, 03:24:47 GMT

Anomaly detection is a common data science problem where the goal is to identify odd or suspicious observations, events, or items in our data that might be indicative of some issues in our data collection process (such as broken sensors, typos in collected forms, etc.) or unexpected events like security breaches, server failures, and so on. Anomaly detection can be performed in a supervised, semi-supervised, and unsupervised manner. For a supervised approach, we need to know whether each observation, event or item is anomalous or genuine, and we use this information during training. Obtaining labels for each observation might often be unrealistic. A semi-supervised approach uses the assumption that we only know which observations are genuine, non-anomalous, and we do not have any information on the anomalous observations.

data mining, detection, machine learning, (15 more...)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.36)

Scalable Learning of Non-Decomposable Objectives

Eban, Elad ET., Schain, Mariano, Mackey, Alan, Gordon, Ariel, Saurous, Rif A., Elidan, Gal

arXiv.org Machine Learning, Mar-1-2017

Modern retrieval systems are often driven by an underlying machine learning model. The goal of such systems is to identify and possibly rank the few most relevant items for a given query or context. Thus, such systems are typically evaluated using a ranking-based performance metric such as the area under the precision-recall curve, the $F_\beta$ score, precision at fixed recall, etc. Obviously, it is desirable to train such systems to optimize the metric of interest. In practice, due to the scalability limitations of existing approaches for optimizing such objectives, large-scale retrieval systems are instead trained to maximize classification accuracy, in the hope that performance as measured via the true objective will also be favorable. In this work we present a unified framework that, using straightforward building block bounds, allows for highly scalable optimization of a wide range of ranking-based objectives. We demonstrate the advantage of our approach on several real-life retrieval problems that are significantly larger than those considered in the literature, while achieving substantial improvement in performance over the accuracy-objective baseline.
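For readers unfamiliar with the ranking-based metrics named in this abstract, the following sketch shows how two of them can be evaluated on a ranked list. It only computes the metrics; it does not implement the paper's optimization framework. The function names and the toy scores are assumptions made for illustration.

```python
def f_beta(precision, recall, beta=1.0):
    """F_beta = (1 + beta^2) * P * R / (beta^2 * P + R); returns 0 when P = R = 0."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def pr_points(scores, labels):
    """(precision, recall) after each cut-off of the list sorted by descending score."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    tp, points = 0, []
    for k, (_, y) in enumerate(ranked, start=1):
        tp += y
        points.append((tp / k, tp / n_pos))
    return points


def precision_at_recall(scores, labels, min_recall):
    """Best precision among cut-offs whose recall is at least min_recall."""
    return max(p for p, r in pr_points(scores, labels) if r >= min_recall)


# Hypothetical relevance scores: 1 marks a relevant item, 0 an irrelevant one.
scores = [0.9, 0.8, 0.7, 0.5, 0.4, 0.3]
labels = [1, 0, 1, 0, 1, 0]
print(precision_at_recall(scores, labels, 0.6))
print(f_beta(1.0, 0.5, beta=2.0))
```

Both quantities depend on the whole ranking rather than on a sum of per-example losses, which is what makes them non-decomposable.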

artificial intelligence, machine learning, objective, (15 more...)

arXiv.org Machine Learning

1608.04802

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Unachievable Region in Precision-Recall Space and Its Effect on Empirical Evaluation

Boyd, Kendrick, Costa, Vitor Santos, Davis, Jesse, Page, David

arXiv.org Artificial Intelligence, Jul-18-2012

Precision-recall (PR) curves and the areas under them are widely used to summarize machine learning results, especially for data sets exhibiting class skew. They are often used analogously to ROC curves and the area under ROC curves. It is known that PR curves vary as class skew changes. What was not recognized before this paper is that there is a region of PR space that is completely unachievable, and the size of this region depends only on the skew. This paper precisely characterizes the size of that region and discusses its implications for empirical evaluation methodology in machine learning.
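One way to see why such a region exists is a worst-case ranking argument, sketched here under stated assumptions rather than quoted from the paper. Let the skew π be the fraction of positives. If every negative is ranked above every positive, then at recall r the precision is πr / (1 − π + πr), and no classifier can do worse at that recall. Integrating this bound over r from 0 to 1 gives an area of 1 + (1 − π) ln(1 − π) / π. The function names below are illustrative, and the numeric integration is only a sanity check on the closed form.

```python
import math


def min_precision(recall, skew):
    """Lowest attainable precision at a given recall when all negatives outrank the positives."""
    return skew * recall / (1.0 - skew + skew * recall)


def unachievable_area(skew):
    """Closed-form area under the minimum PR curve for 0 < skew < 1."""
    return 1.0 + (1.0 - skew) * math.log(1.0 - skew) / skew


def numeric_area(skew, steps=10000):
    """Midpoint-rule integral of min_precision over recall in [0, 1]."""
    h = 1.0 / steps
    return sum(min_precision((i + 0.5) * h, skew) for i in range(steps)) * h


for skew in (0.01, 0.1, 0.5):
    print(f"skew={skew}: closed form={unachievable_area(skew):.4f}, "
          f"numeric={numeric_area(skew):.4f}")
```

The area grows with the skew, so AUCPR values from data sets with different class balance are not directly comparable without accounting for this floor.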

artificial intelligence, machine learning, unachievable region, (17 more...)

arXiv.org Artificial Intelligence

1206.4667

Country: