stream-learn -- open-source Python library for difficult data stream batch analysis
Ksieniewicz, Paweł, Zyblewski, Paweł
stream-learn is a Python package compatible with scikit-learn and developed for the analysis of drifting and imbalanced data streams. Its main component is a stream generator, which can produce a synthetic data stream that may incorporate each of the three main concept drift types (i.e. The package allows conducting experiments following established evaluation methodologies (i.e. In addition, estimators adapted for data stream classification have been implemented, including both simple classifiers and state-of-the-art chunk-based and online classifier ensembles. To improve computational efficiency, the package utilises its own implementations of prediction metrics for imbalanced binary classification tasks.
Keywords: Data stream, Concept drift, Imbalanced data, Dynamic class imbalance
1. Motivation and significance
Pattern recognition research increasingly goes beyond the usual pattern of building classification models on stationary data sets and focuses on data stream processing, where class distributions, and hence also decision boundaries, may change over time [1].
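The abstract's mention of prediction metrics for imbalanced binary classification can be illustrated with a small, dependency-free sketch. It computes balanced accuracy (the mean of recall and specificity), a metric commonly used for imbalanced data, for a single chunk of labels. The function names `confusion_counts` and `balanced_accuracy` are hypothetical and chosen for this illustration only; this is not the package's own implementation, and the source does not state which metrics it provides.

```python
def confusion_counts(y_true, y_pred):
    """Count (tn, fp, fn, tp) for binary labels, with 1 as the positive class."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def balanced_accuracy(y_true, y_pred):
    """Mean of recall and specificity (hypothetical helper, not the library API)."""
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    # Guard against chunks in which one class is absent.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (recall + specificity) / 2


# A chunk with 8 majority-class and 2 minority-class samples,
# scored against a classifier that always predicts the majority class.
chunk_labels = [0] * 8 + [1] * 2
majority_only = [0] * 10
plain_accuracy = sum(t == p for t, p in zip(chunk_labels, majority_only)) / 10
print(plain_accuracy, balanced_accuracy(chunk_labels, majority_only))
```

In this example plain accuracy is 0.8 even though no minority-class sample is recognised, while balanced accuracy is 0.5, which is why metrics of this kind are preferred when the class ratio is skewed or changes over time.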
Jan-29-2020
- Country:
- North America > United States
- District of Columbia > Washington (0.04)
- New York > New York County
- New York City (0.04)
- Massachusetts > Suffolk County
- Boston (0.04)
- Europe > Poland
- Lower Silesia Province > Wroclaw (0.05)
- Genre:
- Research Report (0.40)
- Industry: