Neural Relation Graph: A Unified Framework for Identifying Label Noise and Outlier Data

Jan-19-2025, 13:37:09 GMT–Neural Information Processing Systems

Diagnosing and cleaning data is a crucial step in building robust machine learning systems. However, identifying problems within large-scale datasets with real-world distributions is challenging because of complex issues such as label errors, under-representation, and outliers. In this paper, we propose a unified approach for identifying problematic data by utilizing a largely ignored source of information: the relational structure of data in the feature-embedded space. To this end, we present scalable and effective algorithms for detecting label errors and outlier data based on the relational graph structure of data. We further introduce a visualization tool that provides contextual information about a data point in the feature-embedded space, allowing users to diagnose data interactively.
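The abstract does not spell out the algorithms, so the following is only a minimal illustrative sketch of the general idea of scoring data points from their relations in an embedding space. It is not the paper's method: the cosine-similarity k-nearest-neighbor graph, the function name `relation_scores`, the parameter `k`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def relation_scores(features, labels, k=3):
    """Toy relational scores on a cosine-similarity kNN graph.

    Hypothetical helper for illustration only, not the paper's algorithm.
    Returns (label_score, outlier_score), one value per data point.
    """
    # Normalise rows so that dot products are cosine similarities.
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = feats @ feats.T
    np.fill_diagonal(sim, -np.inf)            # exclude self-edges
    nbrs = np.argsort(-sim, axis=1)[:, :k]    # k most similar points per row
    nbr_sim = np.take_along_axis(sim, nbrs, axis=1)
    # Label score: fraction of neighbours whose label differs from the point's.
    label_score = (labels[nbrs] != labels[:, None]).mean(axis=1)
    # Outlier score: one minus the mean similarity to the nearest neighbours.
    outlier_score = 1.0 - nbr_sim.mean(axis=1)
    return label_score, outlier_score

# Toy embeddings: two tight clusters plus one far-away point (index 10).
features = np.array([
    [1.0, 0.0], [1.0, 0.05], [1.0, -0.05], [1.0, 0.1], [1.0, -0.1],   # cluster A
    [0.0, 1.0], [0.05, 1.0], [-0.05, 1.0], [0.1, 1.0], [-0.1, 1.0],   # cluster B
    [-1.0, -1.0],                                                      # outlier
])
# Index 2 sits in cluster A but carries cluster B's label (a simulated label error).
labels = np.array([0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0])

label_score, outlier_score = relation_scores(features, labels, k=3)
print("most suspicious label:", int(np.argmax(label_score)))
print("most outlying point:", int(np.argmax(outlier_score)))
```

In this toy setup the mislabeled point is surrounded by neighbors with a different label, and the isolated point has low similarity to everything else, so each stands out on its respective score. A real pipeline would take the features from a trained network and would need an approximate nearest-neighbor search to scale beyond small datasets.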

identifying label noise, label noise and outlier data, neural relation graph, (6 more...)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)