Input-Adaptive Proxy for Black Carbon as a Virtual Sensor

Pak Lun Fung, Martha Zaidan, Salla Sillanpää, Anu Kousa, Jarkko V. Niemi, Hilkka Timonen, Joel Kuula, Erkka Saukko, Krista Hannele Luoma, Tuukka Petäjä, Sasu Tarkoma, Markku Kulmala, Tareq Hussein

Research output: Contribution to journal › Article › Scientific › peer-review


Missing data has long been a challenge in air quality measurement. In this study, we develop an input-adaptive proxy that selects its input variables from other air quality variables based on their correlation coefficients with the output variable. The proxy uses an ordinary least squares regression model with robust optimization and limits the input variables to a maximum of three to avoid overfitting. The adaptive proxy learns from the data set and generates the best model, as evaluated by the adjusted coefficient of determination (adjR2). When data are missing in the input variables, the adaptive proxy falls back to the second-best model, and so on down the ranking, until all the data gaps are filled. As a case study, we estimated black carbon (BC) concentration with the input-adaptive proxy at two sites in Helsinki, which represent a street canyon and an urban background setting, respectively. Accumulation mode, traffic counts, nitrogen dioxide and lung deposited surface area appear as input variables in the top-ranked models. In contrast to a traditional proxy, which covers only 20–80% of the data, the input-adaptive proxy provides a complete, continuous BC estimate. The newly developed adaptive proxy also gives generally accurate BC estimates (street canyon: adjR2 = 0.86–0.94; urban background: adjR2 = 0.74–0.91), depending on the season and the day of the week. Owing to its flexibility and reliability, the adaptive proxy can be further extended to estimate other air quality parameters. It can also act as an air quality virtual sensor in support of on-site measurements in the future.
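The fallback logic described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain least squares (the robust optimization step is omitted), ranks every input subset of up to three variables exhaustively instead of screening inputs by correlation coefficient, and uses synthetic data. The function names `adjusted_r2`, `fit_ranked_models` and `adaptive_predict` are hypothetical.

```python
from itertools import combinations

import numpy as np


def adjusted_r2(y, y_hat, n_predictors):
    """Adjusted coefficient of determination for a fitted linear model."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def fit_ranked_models(X, y, max_inputs=3):
    """Fit OLS on every subset of up to max_inputs columns.

    Each model is fitted only on rows where its own inputs and y are present.
    Returns (adjR2, columns, coefficients) tuples, best adjR2 first.
    Assumption: plain least squares; no robust weighting is applied.
    """
    models = []
    for k in range(1, max_inputs + 1):
        for subset in combinations(range(X.shape[1]), k):
            cols = list(subset)
            rows = ~np.isnan(y) & ~np.isnan(X[:, cols]).any(axis=1)
            if rows.sum() <= k + 1:
                continue  # too few complete rows to fit and score this model
            A = np.column_stack([np.ones(rows.sum()), X[rows][:, cols]])
            coef, *_ = np.linalg.lstsq(A, y[rows], rcond=None)
            models.append((adjusted_r2(y[rows], A @ coef, k), cols, coef))
    models.sort(key=lambda m: m[0], reverse=True)
    return models


def adaptive_predict(models, X):
    """Fill each row with the best-ranked model whose inputs are all present."""
    y_hat = np.full(X.shape[0], np.nan)
    for _, cols, coef in models:
        usable = np.isnan(y_hat) & ~np.isnan(X[:, cols]).any(axis=1)
        if usable.any():
            A = np.column_stack([np.ones(usable.sum()), X[usable][:, cols]])
            y_hat[usable] = A @ coef
    return y_hat


# Synthetic stand-in data: four "input" series, with gaps in the two most useful.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 4))
y = 2.0 + 1.5 * X[:, 0] + 0.8 * X[:, 1] + rng.normal(scale=0.1, size=n)
X_gappy = X.copy()
X_gappy[:50, 0] = np.nan
X_gappy[30:80, 1] = np.nan

models = fit_ranked_models(X_gappy, y)
bc_hat = adaptive_predict(models, X_gappy)
print("best inputs:", models[0][1], "gaps left:", int(np.isnan(bc_hat).sum()))
```

Rows where the top-ranked inputs are missing are served by lower-ranked models, so the estimate stays continuous as long as at least one input is available in each row. Accuracy for those rows is correspondingly lower.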
Original language: English
Article number: 182
Issue number: 1
Number of pages: 23
Publication status: Published - 2020
MoE publication type: A1 Journal article-refereed

Fields of Science

  • 112 Statistics and probability
  • input-adaptive proxy
  • robust linear regression
  • virtual sensor
  • 114 Physical sciences
  • black carbon
  • air quality
  • street canyon
  • urban background
