 low-cost sensor


In-field Calibration of Low-Cost Sensors through XGBoost & Aggregate Sensor Data

arXiv.org Artificial Intelligence

Effective large-scale air quality monitoring necessitates distributed sensing due to the pervasive and harmful nature of particulate matter (PM), particularly in urban environments. However, precision comes at a cost: highly accurate sensors are expensive, which limits how many can be deployed and thus their spatial coverage. As a result, low-cost sensors have become popular, though they are prone to drift caused by environmental sensitivity and manufacturing variability. This paper presents a model for in-field sensor calibration using XGBoost ensemble learning to consolidate data from neighboring sensors. This approach reduces dependence on the presumed accuracy of individual sensors and improves generalization across different locations.


Statistical Study of Sensor Data and Investigation of ML-based Calibration Algorithms for Inexpensive Sensor Modules: Experiments from Cape Point

arXiv.org Artificial Intelligence

In this paper, we present a statistical analysis of data from inexpensive sensors. We also present the performance of machine learning algorithms when used for automatic calibration of such sensors. We used a low-cost Non-Dispersive Infrared CO$_2$ sensor co-located with a reference site at Cape Point, South Africa (maintained by Weather South Africa). The collected low-cost sensor data and site truth data are investigated and compared. We compare the performance of Random Forest Regression, Support Vector Regression, 1D Convolutional Neural Network, and 1D-CNN Long Short-Term Memory Network models as methods for automatic calibration, and we examine the statistical properties of their predictions. In addition, we investigate how the performance of these algorithms drifts over time.


SenDaL: An Effective and Efficient Calibration Framework of Low-Cost Sensors for Daily Life

arXiv.org Artificial Intelligence

The collection of accurate and noise-free data is a crucial part of Internet of Things (IoT)-controlled environments. However, the data collected from various sensors in daily life often suffer from inaccuracies. Additionally, IoT-controlled devices with low-cost sensors lack sufficient hardware resources to employ conventional deep-learning models. To overcome this limitation, we propose sensors for daily life (SenDaL), the first framework that utilizes neural networks for calibrating low-cost sensors. SenDaL introduces novel training and inference processes that enable it to achieve accuracy comparable to deep learning models while keeping latency and energy consumption similar to those of linear models. SenDaL is first trained in a bottom-up manner, making decisions based on calibration results from both linear and deep learning models. Once both models are trained, SenDaL makes independent decisions through a top-down inference process, ensuring accuracy and inference speed. Furthermore, SenDaL can select the optimal deep learning model according to the resources of the IoT devices because it is compatible with various deep learning models, such as long short-term memory-based and Transformer-based models. We have verified that SenDaL outperforms existing deep learning models in terms of accuracy, latency, and energy efficiency through experiments conducted in different IoT environments and real-life scenarios.


Leveraging unsupervised data and domain adaptation for deep regression in low-cost sensor calibration

arXiv.org Artificial Intelligence

Air quality monitoring is becoming an essential task with rising awareness about air quality. Low-cost air quality sensors are easy to deploy but are not as reliable as the costly and bulky reference monitors. These low-cost sensors can be calibrated against the reference monitors with the help of deep learning. In this paper, we translate the task of sensor calibration into a semi-supervised domain adaptation problem and propose a novel solution to it. The problem is challenging because it is a regression problem with covariate shift and label gap. We use histogram loss instead of the mean squared or mean absolute error commonly used for regression, and find it useful against covariate shift. To handle the label gap, we propose weighting of samples for adversarial entropy optimization. In experimental evaluations, the proposed scheme outperforms many competitive baselines, which are based on semi-supervised and supervised domain adaptation, in terms of R$^2$ score and mean absolute error. Ablation studies show the relevance of each proposed component in the entire scheme.
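
The abstract does not spell out its histogram loss, so the following is only a minimal sketch of one common formulation: the regression target is turned into a Gaussian-smoothed histogram over fixed bins, and the loss is the cross-entropy between that histogram and the model's softmax output. The function names, the bin edges, and the `sigma` smoothing width are all illustrative assumptions, not details taken from the paper.

```python
import math


def gaussian_target_histogram(y, edges, sigma):
    """Mass that N(y, sigma^2) puts in each bin, renormalised over the bin range.

    Assumes y lies inside (or near) the range covered by `edges`.
    """
    def cdf(t):
        return 0.5 * (1.0 + math.erf((t - y) / (sigma * math.sqrt(2.0))))

    mass = [cdf(hi) - cdf(lo) for lo, hi in zip(edges[:-1], edges[1:])]
    total = sum(mass)
    return [m / total for m in mass]


def softmax(logits):
    """Numerically stable softmax over a list of per-bin logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def histogram_loss(logits, y, edges, sigma=1.0):
    """Cross-entropy between the smoothed target histogram and the prediction."""
    target = gaussian_target_histogram(y, edges, sigma)
    probs = softmax(logits)
    return -sum(t * math.log(p + 1e-12) for t, p in zip(target, probs))


def predicted_value(logits, edges):
    """Point estimate: expectation of the bin centres under the prediction."""
    centers = [(lo + hi) / 2.0 for lo, hi in zip(edges[:-1], edges[1:])]
    return sum(p * c for p, c in zip(softmax(logits), centers))


# Hypothetical example: five 10-unit bins covering 0..50, true value 24.
edges = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
loss_good = histogram_loss([0.0, 0.0, 5.0, 0.0, 0.0], 24.0, edges, sigma=3.0)
loss_bad = histogram_loss([5.0, 0.0, 0.0, 0.0, 0.0], 24.0, edges, sigma=3.0)
```

In this sketch a prediction peaked on the bin containing the true value (`loss_good`) scores lower than one peaked on a distant bin (`loss_bad`), while the loss itself depends only on how probability mass is spread over bins rather than on a single squared or absolute residual.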


Graph Signal Reconstruction Techniques for IoT Air Pollution Monitoring Platforms

arXiv.org Artificial Intelligence

Air pollution monitoring platforms play a very important role in preventing and mitigating the effects of pollution. Recent advances in the field of graph signal processing have made it possible to describe and analyze air pollution monitoring networks using graphs. One of the main applications is the reconstruction of the measured signal in a graph using a subset of sensors. Reconstructing the signal using information from sensor neighbors can help improve the quality of network data; examples include filling in missing data using correlated neighboring nodes, or correcting a drifting sensor using neighboring sensors that are more accurate. This paper compares various types of graph signal reconstruction methods applied to real data sets from Spanish air pollution reference stations. The methods considered are Laplacian interpolation, graph signal processing low-pass based graph signal reconstruction, and kernel-based graph signal reconstruction; they are compared on actual air pollution data sets measuring O3, NO2, and PM10. The ability of the methods to reconstruct the signal of a pollutant is shown, as well as the computational cost of this reconstruction. The results indicate the superiority of kernel-based graph signal reconstruction methods, as well as the difficulty the methods have in scaling to an air pollution monitoring network with a large number of low-cost sensors. However, we show that this scalability problem can be overcome with simple methods, such as partitioning the network using a clustering algorithm.
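
To make the first of these methods concrete, here is a minimal sketch of Laplacian interpolation, assuming an undirected weighted graph stored as nested dictionaries. Values at unobserved nodes are chosen so that the graph Laplacian applied to the signal vanishes there, which means each unobserved node equals the weighted average of its neighbors. The sketch solves this with simple Gauss-Seidel sweeps instead of a direct linear solve; the function name, node labels, and edge weights are hypothetical and not taken from the paper.

```python
def laplacian_interpolate(weights, known, n_iter=500, tol=1e-10):
    """Fill in unobserved node values by harmonic (Laplacian) interpolation.

    weights: dict node -> dict neighbour -> non-negative edge weight (symmetric).
    known:   dict node -> observed value; these values are held fixed.
    Assumes every unobserved node is connected, possibly through other
    unobserved nodes, to at least one observed node.
    """
    start = sum(known.values()) / len(known)  # neutral initial guess
    x = {v: known.get(v, start) for v in weights}
    unknown = [v for v in weights if v not in known]

    for _ in range(n_iter):
        max_change = 0.0
        for v in unknown:
            total_w = sum(weights[v].values())
            if total_w == 0:
                continue  # isolated node: nothing to interpolate from
            # (L x)_v = 0  <=>  x_v is the weighted mean of its neighbours.
            new = sum(w * x[u] for u, w in weights[v].items()) / total_w
            max_change = max(max_change, abs(new - x[v]))
            x[v] = new
        if max_change < tol:
            break
    return x


# Hypothetical four-sensor chain s1 - s2 - s3 - s4 with unit edge weights.
weights = {
    "s1": {"s2": 1.0},
    "s2": {"s1": 1.0, "s3": 1.0},
    "s3": {"s2": 1.0, "s4": 1.0},
    "s4": {"s3": 1.0},
}
known = {"s1": 0.0, "s4": 30.0}
result = laplacian_interpolate(weights, known)
```

On this chain the two unobserved sensors settle on the linear ramp between the observed endpoints. In a real network the edge weights would typically encode distance or correlation between stations, a choice left open here.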


High-Resolution Air Quality Prediction Using Low-Cost Sensors

arXiv.org Machine Learning

The use of low-cost sensors in air quality monitoring networks is still a much-debated topic among practitioners: they are much cheaper than traditional air quality monitoring stations set up by public authorities (a few hundred dollars compared to a few tens of thousands of dollars), at the cost of lower accuracy and robustness. This paper presents a case study of using low-cost sensor measurements in an air quality prediction engine. The engine jointly predicts PM2.5 and PM10 concentrations in the United States at a very high resolution, in the range of a few tens of meters. It is fed with the measurements provided by official air quality monitoring stations, the measurements provided by a network of more than 4000 low-cost sensors across the country, and traffic estimates. We show that the use of low-cost sensor measurements improves the engine's accuracy very significantly. In particular, we derive a strong link between the density of low-cost sensors and the accuracy of the predictions: the more low-cost sensors there are in an area, the more accurate the predictions are. As an illustration, in areas with the highest density of low-cost sensors, the low-cost sensor measurements bring a 25% and 15% improvement in the accuracy of PM2.5 and PM10 predictions, respectively. Another strong conclusion is that in some areas with a high density of low-cost sensors, the engine performs better when fed with low-cost sensor measurements only than when fed with official monitoring station measurements only: this suggests that an air quality monitoring network composed of low-cost sensors is effective in monitoring air quality. This is a very important result, as such a monitoring network is much cheaper to set up.


A Gap Analysis of Low-Cost Outdoor Air Quality Sensor In-Field Calibration

arXiv.org Machine Learning

In recent years, interest in monitoring air quality has been growing. Traditional environmental monitoring stations are very expensive, both to acquire and to maintain; therefore, their deployment is generally very sparse. This is a problem when trying to generate air quality maps with a fine spatial resolution. Given the general interest in air quality monitoring, low-cost air quality sensors have become an active area of research and development. Low-cost air quality sensors can be deployed at a finer level of granularity than traditional monitoring stations. Furthermore, they can be portable and mobile. Low-cost air quality sensors, however, present some challenges: they suffer from cross-sensitivities between different ambient pollutants; they can be affected by external factors such as traffic, weather changes, and human behavior; and their accuracy degrades over time. Some promising machine learning approaches can help us obtain highly accurate measurements with low-cost air quality sensors. In this article, we present low-cost sensor technologies, and we survey and assess machine learning-based techniques for their calibration. We conclude by presenting open questions and directions for future research.