Input-adaptive linear mixed-effects model for estimating alveolar lung-deposited surface area (LDSA) using multipollutant datasets

Pak Lun Fung, Martha Arbayani Zaidan, Jarkko V. Niemi, Erkka Saukko, Hilkka Timonen, Anu Kousa, Joel Kuula, Topi Rönkkö, Ari Karppinen, Sasu Tarkoma, Markku Kulmala, Tuukka Petäjä, Tareq Hussein

Forskningsoutput: TidskriftsbidragArtikelVetenskapligPeer review


Lung-deposited surface area (LDSA) has been considered to be a better metric to explain nanoparticle toxicity instead of the commonly used particulate mass concentration. LDSA concentrations can be obtained either by direct measurements or by calculation based on the empirical lung deposition model and measurements of particle size distribution. However, the LDSA or size distribution measurements are neither compulsory nor regulated by the government. As a result, LDSA data are often scarce spatially and temporally. In light of this, we developed a novel statistical model, named the input-adaptive mixed-effects (IAME) model, to estimate LDSA based on other already existing measurements of air pollutant variables and meteorological conditions. During the measurement period in 2017–2018, we retrieved LDSA data measured by Pegasor AQ Urban and other variables at a street canyon (SC, average LDSA = 19.7 ± 11.3 µm2 cm−3) site and an urban background (UB, average LDSA = 11.2 ± 7.1 µm2 cm−3) site in Helsinki, Finland. For the continuous estimation of LDSA, the IAME model was automatised to select the best combination of input variables, including a maximum of three fixed effect variables and three time indictors as random effect variables. Altogether, 696 submodels were generated and ranked by the coefficient of determination (R2), mean absolute error (MAE) and centred root-mean-square difference (cRMSD) in order. At the SC site, the LDSA concentrations were best estimated by mass concentration of particle of diameters smaller than 2.5 µm (PM2.5), total particle number concentration (PNC) and black carbon (BC), all of which are closely connected with the vehicular emissions. At the UB site, the LDSA concentrations were found to be correlated with PM2.5, BC and carbon monoxide (CO). The accuracy of the overall model was better at the SC site (R2=0.80, MAE = 3.7 µm2 cm−3) than at the UB site (R2=0.77, MAE = 2.3 µm2 cm−3), plausibly because the LDSA source was more tightly controlled by the close-by vehicular emission source. The results also demonstrated that the additional adjustment by taking random effects into account improved the sensitivity and the accuracy of the fixed effect model. Due to its adaptive input selection and inclusion of random effects, IAME could fill up missing data or even serve as a network of virtual sensors to complement the measurements at reference stations.
TidskriftAtmospheric Chemistry and Physics
Sidor (från-till)1861-1882
Antal sidor22
StatusPublicerad - 8 feb. 2022
MoE-publikationstypA1 Tidskriftsartikel-refererad


  • 1172 Miljövetenskap
  • 113 Data- och informationsvetenskap

Citera det här