Scalable Teacher Forcing Network for Semi-Supervised Large Scale Data Streams
Pratama, Mahardhika, Za'in, Choiru, Lughofer, Edwin, Pardede, Eric, Rahayu, Dwi A. P.
arXiv.org Artificial Intelligence
The large-scale data stream problem refers to high-speed information flows that cannot be processed in a scalable manner on a traditional computing platform. It also incurs expensive labelling costs, which makes the deployment of fully supervised algorithms infeasible. Yet semi-supervised learning on large-scale data streams is little explored in the literature, because most existing works are designed for traditional single-node computing environments and are fully supervised. This paper proposes the Weakly Supervised Scalable Teacher Forcing Network (WeScatterNet) to cope with the scarcity of labelled samples and large-scale data streams simultaneously. WeScatterNet is built on the Apache Spark distributed computing platform, with a data-free model fusion strategy for model compression after the parallel computing stage. It features an open network structure to address global and local drift problems, and integrates a data augmentation, annotation and auto-correction ($DA^3$) method for handling partially labelled data streams. The performance of WeScatterNet is numerically evaluated on six large-scale data stream problems with only $25\%$ label proportions. It shows highly competitive performance even when compared with fully supervised learners with $100\%$ label proportions.
Jun-25-2021
- Country:
  - Asia
  - Europe
    - Austria > Upper Austria
      - Linz (0.04)
    - Spain > Canary Islands (0.04)
  - North America > United States
    - Florida > Palm Beach County
      - Boca Raton (0.04)
    - New Jersey > Hudson County
      - Hoboken (0.04)
    - New York > New York County
      - New York City (0.04)
  - Oceania
    - Australia (0.04)
    - New Zealand > North Island
      - Waikato (0.04)
- Genre:
  - Instructional Material (0.68)
  - Research Report (0.63)
- Industry:
  - Education (1.00)
- Technology: