James Jianqiao Yu
余剑峤

Lecturer

Department of Computer Science

University of York

CSE/139, YO10 5GH, UK

jqyu(at)ieee.org

Low-rank Singular Value Thresholding for Recovering Missing Air Quality Data

Authors
Yangwen Yu, James J.Q. Yu*, Victor O.K. Li, and Jacqueline C.K. Lam

Publication
Proc. IEEE International Conference on Big Data, Boston, MA, US, December 2017

Abstract
With increasing awareness of the harmful impacts of urban air pollution, air quality monitoring stations have been deployed in many metropolitan areas. These stations provide continuous air quality monitoring data to the public. However, due to sampling device failures and data processing errors, missing data in air quality measurements are common. Data integrity becomes a critical challenge when such data are employed for public services. In this paper, we investigate the mathematical properties of air quality measurements and attempt to recover the missing data. First, we empirically study the low-rank property of the measurements. Second, we formulate a low-rank matrix completion optimization problem to reconstruct the missing air quality data. The problem is transformed using duality theory, and singular value thresholding (SVT) is employed to develop sub-optimal solutions. Third, to evaluate the performance of our methodology, we conduct a series of case studies covering different types of missing data patterns. The simulation results demonstrate that the proposed methodology effectively recovers missing air quality data and outperforms the existing interpolation method. Finally, we investigate the parameter sensitivity of SVT. Our study can serve as a guideline for real-world missing data recovery implementations.
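
To make the SVT step in the abstract concrete, the following is a minimal sketch of the generic singular value thresholding iteration for low-rank matrix completion, written with NumPy. It is not the paper's implementation: the function names (`singular_value_shrink`, `svt_complete`), the default threshold `tau`, the default step size `delta`, the stopping rule, and the toy station-by-time matrix are all assumptions chosen for illustration, since this page does not give the paper's exact formulation or parameter settings.

```python
import numpy as np


def singular_value_shrink(Y, tau):
    """Soft-threshold the singular values of Y by tau."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s = np.maximum(s - tau, 0.0)  # shrink, and drop values below tau
    return (U * s) @ Vt


def svt_complete(M, mask, tau=None, delta=None, tol=1e-4, max_iter=500):
    """Estimate the missing entries of M with a generic SVT iteration.

    M    : 2-D array; entries where mask is False are ignored (may be NaN).
    mask : boolean array of the same shape, True where M is observed.
    tau, delta : threshold and step size. The defaults are common
                 heuristics and are assumptions, not the paper's values.
    """
    M = np.asarray(M, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    n1, n2 = M.shape
    n_obs = int(mask.sum())
    if n_obs == 0:
        raise ValueError("mask has no observed entries")
    if tau is None:
        tau = 5.0 * np.sqrt(n1 * n2)
    if delta is None:
        delta = 1.2 * n1 * n2 / n_obs

    observed = np.where(mask, M, 0.0)
    obs_norm = np.linalg.norm(observed)
    Y = np.zeros_like(M)  # dual variable, nonzero only on observed entries
    X = np.zeros_like(M)  # current low-rank estimate
    for _ in range(max_iter):
        X = singular_value_shrink(Y, tau)
        residual = np.where(mask, M - X, 0.0)
        # Stop once the observed entries are fitted to relative accuracy tol.
        if np.linalg.norm(residual) <= tol * obs_norm:
            break
        Y += delta * residual
    return X


if __name__ == "__main__":
    # Hypothetical rank-1 "stations x time" matrix with 20% of entries hidden.
    u = np.linspace(1.0, 2.0, 20)
    v = np.linspace(0.5, 1.5, 20)
    truth = np.outer(u, v)
    i, j = np.indices(truth.shape)
    mask = (i + j) % 5 != 0
    estimate = svt_complete(np.where(mask, truth, np.nan), mask)
    err = np.linalg.norm((estimate - truth)[~mask]) / np.linalg.norm(truth[~mask])
    print(f"relative error on hidden entries: {err:.2e}")
```

In this sketch a larger `tau` pushes the estimate toward lower rank at the cost of more iterations, and `delta` trades convergence speed against stability, which is the kind of parameter sensitivity the abstract refers to. Real monitoring data would first need to be arranged into a matrix (for example, stations by time steps) with a mask marking the observed readings.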