Unleashing Realistic Air Quality Forecasting: Introducing the Ready-to-Use PurpleAirSF Dataset
Zuo, Jingwei, Li, Wenbin, Baldo, Michele, Hacid, Hakim
arXiv.org Artificial Intelligence
Air quality forecasting has garnered significant attention recently, with data-driven models taking center stage due to advances in machine learning and deep learning. However, researchers face challenges with complex data acquisition and the lack of open-source datasets, which hinder efficient model validation. This paper introduces PurpleAirSF, a comprehensive and easily accessible dataset collected from the PurpleAir network. With its high temporal resolution, various air quality measures, and diverse geographical coverage, this dataset serves as a useful tool for researchers aiming to develop novel forecasting models, study air pollution patterns, and investigate their impacts on health and the environment. We present a detailed account of the data collection and processing methods employed to build PurpleAirSF. Furthermore, we conduct preliminary experiments using both classic and modern spatio-temporal forecasting models, thereby establishing a benchmark for future air quality forecasting tasks.
Nov-13-2023
- Country:
  - North America > United States
    - New York > New York County
      - New York City (0.04)
    - California > San Francisco County
      - San Francisco (0.05)
  - Europe > Germany
    - Hamburg (0.05)
  - Asia > China
    - Beijing > Beijing (0.05)
    - Tianjin Province > Tianjin (0.04)
    - Guangdong Province
- Genre:
  - Research Report (0.50)
- Technology: