Histogram approaches for imbalanced data streams regression
Aminian, Ehsan, Gama, Joao, Ribeiro, Rita P.
arXiv.org Artificial Intelligence
Handling imbalanced data streams in regression tasks is a significant challenge because rare instances can appear anywhere in the target distribution rather than only at its extreme values. In this paper, we introduce two novel data-level sampling strategies, HistUS and HistOS, that use histograms to dynamically balance data streams. Unlike previous methods based on Chebyshev's inequality, the proposed techniques identify and handle rare cases across the entire distribution. Extensive experiments on synthetic and real-world datasets show that HistUS and HistOS outperform traditional methods, leading to more accurate and robust regression models in streaming environments.
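To make the general idea concrete, the following is a minimal sketch of how a histogram over the target variable could drive under-sampling of a stream. It is not the authors' HistUS algorithm, whose details are not given in the abstract. The class name `HistogramUnderSampler`, the fixed target range, the equal-width bins, and the keep-probability rule are all assumptions made for illustration.

```python
import random
from collections import Counter


class HistogramUnderSampler:
    """Illustrative only (hypothetical, not HistUS): keeps a streamed example
    with a probability that shrinks as its target-value bin gets crowded."""

    def __init__(self, low, high, n_bins=10, seed=0):
        # Assumption: the target range [low, high] is known in advance.
        self.low, self.high, self.n_bins = low, high, n_bins
        self.counts = Counter()          # examples seen so far, per bin
        self.rng = random.Random(seed)   # seeded for reproducibility

    def _bin(self, y):
        # Map y to an equal-width bin index, clamping out-of-range targets.
        width = (self.high - self.low) / self.n_bins
        idx = int((y - self.low) / width)
        return min(max(idx, 0), self.n_bins - 1)

    def observe(self, y):
        """Update the histogram with target y; return True to keep the example."""
        b = self._bin(y)
        self.counts[b] += 1
        # Assumed rule: keep with probability (rarest non-empty bin count)
        # divided by (this bin's count), so examples in rare bins are always kept.
        p = min(self.counts.values()) / self.counts[b]
        return self.rng.random() < p
```

In a streaming loop, a learner would be updated only on examples for which `observe(y)` returns `True`. Because the histogram covers the whole target range, a sparsely populated bin in the middle of the distribution is treated as rare in the same way as one at the extremes.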
Jan-29-2025