Effects of temporally external auxiliary data on model-based inference

Zhengyang Hou, Qing Xu, Ronald E. McRoberts, Jonathan A. Greenberg, Jinxiu Liu, Janne Heiskanen, Sari Pitkänen, Petteri Packalen

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

One of the benefits of model-based inference relative to design-based inference is that probability samples are not required which means that models can be constructed using data external to the area of interest. Although "external" usually means spatially or geographically external, it could also be used in the temporal sense that the model is constructed using data whose dates are temporally external to the dates of the data to which the model is applied. This study focuses on assessing the effects of such temporally external application data on model-based inference using remotely sensed auxiliary information. The study area was in Burkina Faso, and the variable of interest was firewood volume (m³/ha). A sample of 160 field plots was selected from the population and measured, and auxiliary datasets from Landsat 8 were acquired. Models were fit using weighted least squares; the population mean, $\mu$, was estimated; and the variance of the population mean, $\mathrm{Var}(\hat{\mu})$, was estimated using both an analytical variance estimator, $\widehat{\mathrm{Var}}(\hat{\mu})_{\mathrm{an}}$, and an empirical bootstrap estimator, $\widehat{\mathrm{Var}}(\hat{\mu})_{\mathrm{boot}}$. The estimates, $\hat{\mu}$ and $\widehat{\mathrm{Var}}(\hat{\mu})$, were compared for models constructed using calibration and application data of the same date and models constructed using calibration and application data whose dates differed. The primary results were twofold. First, for cases for which the dates of the model calibration and application data were the same, $\hat{\mu}$, $\widehat{\mathrm{Var}}(\hat{\mu})_{\mathrm{an}}$, $\widehat{\mathrm{Var}}(\hat{\mu})_{\mathrm{boot}}$ and $\widehat{\mathrm{Bias}}(\hat{\mu})$ were similar across datasets. These results suggest that the particular date of the dataset from which the calibration and application data are obtained may be mostly arbitrary assuming the relation between the dependent and independent variables does not change over time.
Second, for a model for which the calibration and application data were obtained from temporally different datasets, $\widehat{\mathrm{Var}}(\hat{\mu})_{\mathrm{an}}$, $\widehat{\mathrm{Var}}(\hat{\mu})_{\mathrm{boot}}$, and $\widehat{\mathrm{Bias}}(\hat{\mu})$ were all greater than when the calibration and application data were not temporally different. Further, the criterion for screening candidate models must be based on estimation of $\hat{\mu}$ and $\widehat{\mathrm{Var}}(\hat{\mu})$ rather than the model prediction accuracy or goodness of fit. The adverse effects of differing dates for the calibration and application data were exacerbated as the difference in dates increased. Finally, because the temporal differences also affected the analytical variance calculation, the bootstrapping procedure is recommended. © 2017 Elsevier Inc. All rights reserved.
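
The estimation workflow summarised in the abstract (fit a model by weighted least squares, predict for every population unit, average the predictions, and bootstrap the sample to estimate the variance of that average) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the article: the data are synthetic, the single predictor and its "different date" shift are hypothetical, the weights are chosen arbitrarily, and a simple pairs bootstrap stands in for whatever resampling scheme the authors used. Only the sample size of 160 plots comes from the abstract. The analytical variance estimator is not shown.

```python
import numpy as np


def wls_fit(X, y, w):
    """Weighted least squares: minimise sum_i w_i * (y_i - x_i @ beta)**2."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta


def model_based_mean(beta, X_pop):
    """Estimate the population mean as the mean of the model predictions."""
    return float(np.mean(X_pop @ beta))


def bootstrap_mean_variance(X, y, w, X_pop, n_boot=500, seed=0):
    """Pairs bootstrap (an assumption): resample plots, refit, re-predict.

    Returns the mean and the sample variance of the bootstrap estimates.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    means = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample plots with replacement
        beta_b = wls_fit(X[idx], y[idx], w[idx])
        means[b] = model_based_mean(beta_b, X_pop)
    return float(means.mean()), float(means.var(ddof=1))


# --- Synthetic illustration (hypothetical values throughout) ---
rng = np.random.default_rng(42)
x_pop = rng.uniform(0.1, 0.8, size=5000)       # hypothetical spectral predictor
plots = rng.choice(x_pop.size, size=160, replace=False)
x_s = x_pop[plots]
y_s = 5.0 + 40.0 * x_s + rng.normal(0.0, 3.0 * x_s)  # noise grows with x
w_s = 1.0 / x_s**2                              # assumed inverse-variance weights
X_s = np.column_stack([np.ones_like(x_s), x_s])

# Application data from the same date as the calibration data.
X_pop_same = np.column_stack([np.ones_like(x_pop), x_pop])
mu_same, var_same = bootstrap_mean_variance(X_s, y_s, w_s, X_pop_same)

# Application data from a different date: the predictor is shifted
# (a hypothetical seasonal change), but the model is not recalibrated.
X_pop_diff = np.column_stack([np.ones_like(x_pop), 0.9 * x_pop - 0.05])
mu_diff, var_diff = bootstrap_mean_variance(X_s, y_s, w_s, X_pop_diff)

print(f"same date:      mean={mu_same:.2f}, bootstrap variance={var_same:.4f}")
print(f"different date: mean={mu_diff:.2f}, bootstrap variance={var_diff:.4f}")
```

In this toy setup the shifted application predictor moves the estimated mean away from the value obtained with same-date data, which is the kind of effect the study quantifies on real Landsat 8 datasets; the sketch makes no claim about the magnitude or direction of the effects reported in the article.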
Original language: English
Journal: Remote Sensing of Environment
Volume: 198
Pages (from-to): 150-159
Number of pages: 10
ISSN: 0034-4257
DOIs
Publication status: Published - 2017
MoE publication type: A1 Journal article-refereed

Fields of Science

  • 1172 Environmental sciences
