Machine Learning for a Low-cost Air Pollution Network

Machine Learning

Data collection in economically constrained countries often necessitates using approximate and biased measurements due to the low cost of the sensors used. This leads to potentially invalid predictions and poor policies or decision making. This is especially an issue if methods from resource-rich regions are applied without handling these additional constraints. In this paper we show, through the use of an air pollution network example, how using probabilistic machine learning can mitigate some of the technical constraints. Specifically, we experiment with modelling the calibration for individual sensors as either distributions or Gaussian processes over time, and discuss the wider issues around the decision process.
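To make the idea of a calibration modelled as a Gaussian process over time concrete, here is a minimal sketch. It is not the paper's method: the synthetic data, the RBF kernel, the hyperparameter values and the function names (`rbf_kernel`, `gp_posterior`) are all assumptions chosen for illustration. It treats the offset between a low-cost sensor and a co-located reference instrument as a slowly drifting function of time and returns a posterior mean and variance for that offset.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale, variance):
    """Squared-exponential covariance between two 1-D arrays of times."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)


def gp_posterior(t_train, y_train, t_test,
                 lengthscale=10.0, variance=4.0, noise_var=0.25):
    """Zero-mean GP regression: posterior mean and variance at t_test.

    Hyperparameters are fixed by hand here (an assumption); in practice
    they would be learned, e.g. by maximising the marginal likelihood.
    """
    K = rbf_kernel(t_train, t_train, lengthscale, variance)
    K += noise_var * np.eye(len(t_train))
    K_s = rbf_kernel(t_train, t_test, lengthscale, variance)
    K_ss = rbf_kernel(t_test, t_test, lengthscale, variance)

    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)


# Synthetic co-location data (hypothetical): the low-cost sensor reads the
# reference value plus an offset that drifts linearly over 60 days.
rng = np.random.default_rng(0)
t = np.arange(0.0, 60.0, 2.0)                      # days
true_offset = 3.0 + 0.05 * t
reference = 20.0 + 5.0 * np.sin(t / 7.0)
low_cost = reference + true_offset + rng.normal(0.0, 0.5, size=t.shape)

# Model the observed offset; subtract its mean so a zero-mean GP is reasonable.
offset_obs = low_cost - reference
offset_centre = offset_obs.mean()
t_query = np.array([15.0, 45.0])
offset_mean, offset_var = gp_posterior(t, offset_obs - offset_centre, t_query)
offset_mean = offset_mean + offset_centre

# A calibrated reading is the raw reading minus the estimated offset, with
# offset_var indicating how much to trust the correction.
print(offset_mean, np.sqrt(offset_var))
```

The posterior variance is what distinguishes this from a fixed correction factor: far from any co-location data it grows, which signals that the calibration should be trusted less.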

A Gap Analysis of Low-Cost Outdoor Air Quality Sensor In-Field Calibration

Machine Learning

In recent years, interest in monitoring air quality has been growing. Traditional environmental monitoring stations are very expensive, both to acquire and to maintain; their deployment is therefore generally very sparse. This is a problem when trying to generate air quality maps with a fine spatial resolution. Given the general interest in air quality monitoring, low-cost air quality sensors have become an active area of research and development. Low-cost air quality sensors can be deployed at a finer level of granularity than traditional monitoring stations. Furthermore, they can be portable and mobile. Low-cost air quality sensors, however, present some challenges: they suffer from cross-sensitivities between different ambient pollutants; they can be affected by external factors such as traffic, weather changes, and human behavior; and their accuracy degrades over time. Some promising machine learning approaches can help us obtain highly accurate measurements with low-cost air quality sensors. In this article, we present low-cost sensor technologies, and we survey and assess machine learning-based techniques for their calibration. We conclude by presenting open questions and directions for future research.

Sensing the Air We Breathe — The OpenSense Zurich Dataset

AAAI Conferences

Monitoring and managing urban air pollution is a significant challenge for the sustainability of our environment. We quickly survey the air pollution modeling problem, introduce a new dataset of mobile air quality measurements in Zurich, and discuss the challenges of making sense of these data.

Multi-task Learning for Aggregated Data using Gaussian Processes

Machine Learning

Aggregated data is commonplace in areas such as epidemiology and demography. For example, census data for a population is usually given as averages defined over time periods or spatial resolutions (cities, regions or countries). In this paper, we present a novel multi-task learning model based on Gaussian processes for joint learning of variables that have been aggregated at different input scales. Our model represents each task as the linear combination of the realizations of latent processes that are integrated at a different scale per task. We are then able to compute the cross-covariance between the different tasks either analytically or numerically. We also allow each task to have a potentially different likelihood model and provide a variational lower bound that can be optimised in a stochastic fashion, making our model suitable for larger datasets. We show examples of the model in a synthetic example, a fertility dataset and an air pollution prediction application.
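One way to write down the construction described in this abstract is sketched below. The notation is an assumption and is not taken from the paper: $u_q$ are independent latent Gaussian processes with kernels $k_q$, $a_{d,q}$ are mixing weights, and $\upsilon$ is the interval or region over which task $d$ is aggregated.

```latex
% Task d as a linear combination of latent processes integrated over a support \upsilon
f_d(\upsilon) = \sum_{q=1}^{Q} a_{d,q} \int_{\upsilon} u_q(z)\, \mathrm{d}z,
\qquad u_q \sim \mathcal{GP}\big(0,\, k_q(z, z')\big).

% Cross-covariance between two tasks aggregated over supports \upsilon and \upsilon'
\operatorname{cov}\big[f_d(\upsilon),\, f_{d'}(\upsilon')\big]
  = \sum_{q=1}^{Q} a_{d,q}\, a_{d',q}
    \int_{\upsilon} \int_{\upsilon'} k_q(z, z')\, \mathrm{d}z'\, \mathrm{d}z .
```

If the data are averages rather than totals, each integral would additionally be scaled by $1/|\upsilon|$. The double integral has a closed form for some kernels and supports and must otherwise be approximated, which matches the abstract's remark that the cross-covariance is computed either analytically or numerically.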

Adaptive machine learning strategies for network calibration of IoT smart air quality monitoring devices

Machine Learning

Air Quality Multi-sensor Systems (AQMS) are IoT devices based on arrays of low-cost chemical microsensors that have recently been shown capable of providing relatively accurate quantitative estimates of air pollutants. Their availability makes it possible to deploy pervasive Air Quality Monitoring (AQM) networks that will solve the geographical sparseness issue that affects the current network of AQ Regulatory Monitoring Systems (AQRMS). Unfortunately, their accuracy has been shown to be limited in long-term field deployments because of several technological issues, including sensor poisoning or ageing, non-target gas interference, and lack of fabrication repeatability. Seasonal changes in the probability distribution of priors, observables and hidden context variables (i.e. non-observable interferents) challenge field data-driven calibration models, whose short- to mid-term performance has recently come to the attention of urban authorities and monitoring agencies. In this work, we address this non-stationary framework with adaptive learning strategies in order to prolong the validity of multisensor calibration models, enabling continuous learning. The influence of relevant parameters in different network and node-to-node recalibration scenarios is analyzed. The results are hence useful for pervasive deployments aimed at permanent high-resolution AQ mapping in urban scenarios, as well as for the use of AQMS as AQRMS backup systems that provide data when AQRMS data are unavailable because of faults or scheduled maintenance.