Veli: Unsupervised Method and Unified Benchmark for Low-Cost Air Quality Sensor Correction

📅 2025-08-01

🤖 AI Summary

Low-cost air quality sensors (LCS) suffer from drift, calibration errors, and environmental interference, which limits their accuracy and makes calibration dependent on co-located reference instruments. To address this, the paper proposes Veli, an unsupervised Bayesian correction model for LCS that requires no co-location with reference stations. Veli uses variational inference to learn a disentangled representation of LCS readings that separates the true pollutant reading from sensor noise. The paper also introduces AQ-SDR (Air Quality Sensor Data Repository), described as the largest AQ sensor benchmark to date, with readings from 23,737 LCS and reference stations across multiple regions. Veli is reported to generalize well in both in-distribution and out-of-distribution settings and to handle sensor drift and erratic sensor behavior. The authors state that the model code and dataset will be released when the paper is published.

📝 Abstract

Urban air pollution is a major health crisis causing millions of premature deaths annually, underscoring the urgent need for accurate and scalable monitoring of air quality (AQ). While low-cost sensors (LCS) offer a scalable alternative to expensive reference-grade stations, their readings are affected by drift, calibration errors, and environmental interference. To address these challenges, we introduce Veli (Reference-free Variational Estimation via Latent Inference), an unsupervised Bayesian model that leverages variational inference to correct LCS readings without requiring co-location with reference stations, eliminating a major deployment barrier. Specifically, Veli constructs a disentangled representation of the LCS readings, effectively separating the true pollutant reading from the sensor noise. To build our model and address the lack of standardized benchmarks in AQ monitoring, we also introduce the Air Quality Sensor Data Repository (AQ-SDR). AQ-SDR is the largest AQ sensor benchmark to date, with readings from 23,737 LCS and reference stations across multiple regions. Veli demonstrates strong generalization across both in-distribution and out-of-distribution settings, effectively handling sensor drift and erratic sensor behavior. Code for model and dataset will be made public when this paper is published.
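
To make the idea of reference-free correction concrete, here is a minimal sketch. It is not Veli's variational model, whose details are not given here. It swaps in a much simpler stand-in: alternating least squares on a toy additive model where each reading is a shared pollutant signal plus a constant per-sensor offset plus noise. The model, the function names, the synthetic data, and the assumption that offsets average to zero across the network are all hypothetical choices made for illustration.

```python
import math
import random


def separate_signal_and_offsets(readings, n_iters=20):
    """Alternating least squares for the toy model y[s][t] = x[t] + b[s] + noise.

    readings: list of per-sensor lists, all of the same length.
    Returns (signal, offsets). No reference station is used.
    """
    n_sensors, n_steps = len(readings), len(readings[0])
    offsets = [0.0] * n_sensors
    signal = [0.0] * n_steps
    for _ in range(n_iters):
        # Update the shared signal given the current offsets.
        for t in range(n_steps):
            signal[t] = sum(readings[s][t] - offsets[s]
                            for s in range(n_sensors)) / n_sensors
        # Update each sensor's offset given the current signal.
        for s in range(n_sensors):
            offsets[s] = sum(readings[s][t] - signal[t]
                             for t in range(n_steps)) / n_steps
        # Identifiability assumption: offsets average to zero over the network.
        mean_offset = sum(offsets) / n_sensors
        offsets = [b - mean_offset for b in offsets]
    return signal, offsets


def rmse(a, b):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def simulate(n_sensors=8, n_steps=200, noise_sd=0.5, seed=0):
    """Hypothetical synthetic data: a smooth curve plus per-sensor offsets."""
    rng = random.Random(seed)
    truth = [20.0 + 10.0 * math.sin(2 * math.pi * t / 50)
             for t in range(n_steps)]
    raw = [rng.uniform(-5.0, 5.0) for _ in range(n_sensors)]
    mean_raw = sum(raw) / n_sensors
    offsets = [b - mean_raw for b in raw]  # enforce the zero-mean assumption
    readings = [[truth[t] + offsets[s] + rng.gauss(0.0, noise_sd)
                 for t in range(n_steps)]
                for s in range(n_sensors)]
    return truth, offsets, readings


truth, true_offsets, readings = simulate()
signal, offsets = separate_signal_and_offsets(readings)
print("RMSE of recovered signal:", round(rmse(signal, truth), 3))
print("RMSE of raw sensor 0:    ", round(rmse(readings[0], truth), 3))
```

The zero-mean constraint is what makes the toy problem solvable without a reference: without it, any constant could be moved between the signal and the offsets. Real sensors also show drift and environment-dependent errors, which this sketch does not model.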

Problem

Research questions and friction points this paper is trying to address.

Corrects low-cost air sensor errors without reference stations

Separates true pollutant readings from sensor noise

Provides largest standardized benchmark for air quality sensors

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised Bayesian model for sensor correction

Variational inference without reference stations

Disentangled representation separates the true pollutant reading from sensor noise
