A comprehensive evaluation of predictive performance of 33 species distribution models at species and community levels

Anna Norberg, Nerea Abrego Antia, F. Guillaume Blanchet, Frederick R. Adler, Barbara J. Anderson, Jani Anttila, Miguel B. Araújo, Tad Anthony Dallas, David Dunson, Jane Elith, Scott Foster, Richard Fox, Janet Franklin, William Godsoe, Antoine Guisan, Bob O'Hara, Nicole A. Hill, Robert D. Holt, Francis K.C Hui, Magne Husby, John Atle Kålås, Aleksi Lehikoinen, Miska Luoto, Heidi K. Mod, Graeme Newell, Ian Renner, Tomas Valter Roslin, Janne Soininen, Wilfried Thuiller, Jarno Petteri Vanhatalo, David Warton, Matt White, Niklaus E. Zimmermann, Dominique Gravel, Otso Tapio Ovaskainen

Research output: Contribution to journal, Article, Scientific, peer-reviewed

Abstract

A large array of species distribution model (SDM) approaches has been developed for explaining and predicting the occurrences of individual species or species assemblages. Given the wealth of existing models, it is unclear which models perform best for interpolation or extrapolation of existing data sets, particularly when one is concerned with species assemblages. We compared the predictive performance of 33 variants of 15 widely applied and recently emerged SDMs in the context of multispecies data, including both joint SDMs that model multiple species together, and stacked SDMs that model each species individually combining the predictions afterward. We offer a comprehensive evaluation of these SDM approaches by examining their performance in predicting withheld empirical validation data of different sizes representing five different taxonomic groups, and for prediction tasks related to both interpolation and extrapolation. We measure predictive performance by 12 measures of accuracy, discrimination power, calibration, and precision of predictions, for the biological levels of species occurrence, species richness, and community composition. Our results show large variation among the models in their predictive performance, especially for communities comprising many species that are rare. The results do not reveal any major trade-offs among measures of model performance; the same models performed generally well in terms of accuracy, discrimination, and calibration, and for the biological levels of individual species, species richness, and community composition. In contrast, the models that gave the most precise predictions were not well calibrated, suggesting that poorly performing models can make overconfident predictions. However, none of the models performed well for all prediction tasks. 
As a general strategy, we therefore propose that researchers fit a small set of models showing complementary performance, and then apply a cross-validation procedure involving separate data to establish which of these models performs best for the goal of the study.
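
The strategy proposed in the abstract, fitting a few candidate models and letting cross-validation pick among them, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the article: the two candidate models (`fit_prevalence`, `fit_logistic`), the synthetic presence-absence data, the 5-fold split, and the two scores (AUC for discrimination, mean absolute error for accuracy) are all hypothetical stand-ins for the 33 model variants and 12 performance measures evaluated in the study.

```python
import math
import random

def auc(y_true, p_pred):
    """Discrimination: chance that a random presence outranks a random absence."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

def mean_abs_error(y_true, p_pred):
    """Accuracy: mean absolute difference between predicted probability and outcome."""
    return sum(abs(y - p) for y, p in zip(y_true, p_pred)) / len(y_true)

def fit_prevalence(x, y):
    """Hypothetical baseline: predict the training prevalence at every site."""
    prev = sum(y) / len(y)
    return lambda x_new: [prev for _ in x_new]

def fit_logistic(x, y, steps=500, lr=0.5):
    """Hypothetical one-covariate logistic regression fitted by gradient descent."""
    a, b, n = 0.0, 0.0, len(x)
    for _ in range(steps):
        preds = [1 / (1 + math.exp(-(a + b * xi))) for xi in x]
        a -= lr * sum(p - yi for p, yi in zip(preds, y)) / n
        b -= lr * sum((p - yi) * xi for p, yi, xi in zip(preds, y, x)) / n
    return lambda x_new: [1 / (1 + math.exp(-(a + b * xi))) for xi in x_new]

def cross_validate(fit, x, y, k=5):
    """Pool out-of-fold predictions from k-fold cross-validation and score them."""
    n = len(x)
    preds = [0.0] * n
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        predict = fit([x[i] for i in train], [y[i] for i in train])
        for i, p in zip(test, predict([x[i] for i in test])):
            preds[i] = p
    return {"auc": auc(y, preds), "mae": mean_abs_error(y, preds)}

# Synthetic single-species data (an assumption): occurrence depends on one covariate.
rng = random.Random(1)
x = [rng.uniform(-2, 2) for _ in range(200)]
y = [1 if rng.random() < 1 / (1 + math.exp(-2 * xi)) else 0 for xi in x]

candidates = [("prevalence", fit_prevalence), ("logistic", fit_logistic)]
results = {name: cross_validate(fit, x, y) for name, fit in candidates}
best = max(results, key=lambda name: results[name]["auc"])
print(results, "-> selected:", best)
```

Selecting on a single score is a simplification; since the abstract reports that no model performed well for all prediction tasks, the score used for selection would in practice be chosen to match the goal of the study.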
Original language: English
Article number: 01370
Journal: Ecological Monographs
Volume: 89
Issue number: 3
Number of pages: 24
ISSN: 0012-9615
DOI: https://doi.org/10.1002/ecm.1370
Status: Published - August 2019
Ministry of Education and Culture (OKM) publication type: A1 Original article in a scientific journal, peer-reviewed

Fields of Science

  • 1172 Environmental sciences

Cite this

Norberg, Anna ; Abrego Antia, Nerea ; Blanchet, F. Guillaume ; Adler, Frederick R. ; Anderson, Barbara J. ; Anttila, Jani ; Araújo, Miguel B. ; Dallas, Tad Anthony ; Dunson, David ; Elith, Jane ; Foster, Scott ; Fox, Richard ; Franklin, Janet ; Godsoe, William ; Guisan, Antoine ; O'Hara, Bob ; Hill, Nicole A. ; Holt, Robert D. ; Hui, Francis K.C ; Husby, Magne ; Kålås, John Atle ; Lehikoinen, Aleksi ; Luoto, Miska ; Mod, Heidi K. ; Newell, Graeme ; Renner, Ian ; Roslin, Tomas Valter ; Soininen, Janne ; Thuiller, Wilfried ; Vanhatalo, Jarno Petteri ; Warton, David ; White, Matt ; Zimmermann, Niklaus E. ; Gravel, Dominique ; Ovaskainen, Otso Tapio. / A comprehensive evaluation of predictive performance of 33 species distribution models at species and community levels. In: Ecological Monographs. 2019 ; Vol. 89, No. 3.
@article{85463af74dbc47bf85f9502fc0ca206f,
title = "A comprehensive evaluation of predictive performance of 33 species distribution models at species and community levels",
abstract = "A large array of species distribution model (SDM) approaches has been developed for explaining and predicting the occurrences of individual species or species assemblages. Given the wealth of existing models, it is unclear which models perform best for interpolation or extrapolation of existing data sets, particularly when one is concerned with species assemblages. We compared the predictive performance of 33 variants of 15 widely applied and recently emerged SDMs in the context of multispecies data, including both joint SDMs that model multiple species together, and stacked SDMs that model each species individually combining the predictions afterward. We offer a comprehensive evaluation of these SDM approaches by examining their performance in predicting withheld empirical validation data of different sizes representing five different taxonomic groups, and for prediction tasks related to both interpolation and extrapolation. We measure predictive performance by 12 measures of accuracy, discrimination power, calibration, and precision of predictions, for the biological levels of species occurrence, species richness, and community composition. Our results show large variation among the models in their predictive performance, especially for communities comprising many species that are rare. The results do not reveal any major trade-offs among measures of model performance; the same models performed generally well in terms of accuracy, discrimination, and calibration, and for the biological levels of individual species, species richness, and community composition. In contrast, the models that gave the most precise predictions were not well calibrated, suggesting that poorly performing models can make overconfident predictions. However, none of the models performed well for all prediction tasks. 
As a general strategy, we therefore propose that researchers fit a small set of models showing complementary performance, and then apply a cross-validation procedure involving separate data to establish which of these models performs best for the goal of the study.",
keywords = "BIOTIC INTERACTIONS, CLIMATE, GENERALIZED ADDITIVE-MODELS, IMPROVE PREDICTION, INCORPORATING SPATIAL AUTOCORRELATION, NEURAL-NETWORKS, NICHE, RANGE SHIFTS, SIMULATED DATA, STATISTICAL-MODELS, community assembly, community modeling, environmental filtering, joint species distribution model, model performance, prediction, predictive power, species interactions, stacked species distribution model, 1172 Environmental sciences",
author = "Anna Norberg and {Abrego Antia}, Nerea and Blanchet, {F. Guillaume} and Adler, {Frederick R.} and Anderson, {Barbara J.} and Jani Anttila and Ara{\'u}jo, {Miguel B.} and Dallas, {Tad Anthony} and David Dunson and Jane Elith and Scott Foster and Richard Fox and Janet Franklin and William Godsoe and Antoine Guisan and Bob O'Hara and Hill, {Nicole A.} and Holt, {Robert D.} and Hui, {Francis K.C} and Magne Husby and K{\aa}l{\aa}s, {John Atle} and Aleksi Lehikoinen and Miska Luoto and Mod, {Heidi K.} and Graeme Newell and Ian Renner and Roslin, {Tomas Valter} and Janne Soininen and Wilfried Thuiller and Vanhatalo, {Jarno Petteri} and David Warton and Matt White and Zimmermann, {Niklaus E.} and Dominique Gravel and Ovaskainen, {Otso Tapio}",
year = "2019",
month = "8",
doi = "10.1002/ecm.1370",
language = "English",
volume = "89",
journal = "Ecological Monographs",
issn = "0012-9615",
publisher = "Wiley",
number = "3",

}

Norberg, A, Abrego Antia, N, Blanchet, FG, Adler, FR, Anderson, BJ, Anttila, J, Araújo, MB, Dallas, TA, Dunson, D, Elith, J, Foster, S, Fox, R, Franklin, J, Godsoe, W, Guisan, A, O'Hara, B, Hill, NA, Holt, RD, Hui, FKC, Husby, M, Kålås, JA, Lehikoinen, A, Luoto, M, Mod, HK, Newell, G, Renner, I, Roslin, TV, Soininen, J, Thuiller, W, Vanhatalo, JP, Warton, D, White, M, Zimmermann, NE, Gravel, D & Ovaskainen, OT 2019, 'A comprehensive evaluation of predictive performance of 33 species distribution models at species and community levels', Ecological Monographs, vol. 89, no. 3, 01370. https://doi.org/10.1002/ecm.1370

A comprehensive evaluation of predictive performance of 33 species distribution models at species and community levels. / Norberg, Anna; Abrego Antia, Nerea; Blanchet, F. Guillaume; Adler, Frederick R.; Anderson, Barbara J.; Anttila, Jani; Araújo, Miguel B.; Dallas, Tad Anthony; Dunson, David; Elith, Jane; Foster, Scott; Fox, Richard; Franklin, Janet; Godsoe, William; Guisan, Antoine; O'Hara, Bob; Hill, Nicole A.; Holt, Robert D.; Hui, Francis K.C; Husby, Magne; Kålås, John Atle; Lehikoinen, Aleksi; Luoto, Miska; Mod, Heidi K.; Newell, Graeme; Renner, Ian; Roslin, Tomas Valter; Soininen, Janne; Thuiller, Wilfried; Vanhatalo, Jarno Petteri; Warton, David; White, Matt; Zimmermann, Niklaus E.; Gravel, Dominique; Ovaskainen, Otso Tapio.

In: Ecological Monographs, Vol. 89, No. 3, 01370, 08.2019.


TY - JOUR

T1 - A comprehensive evaluation of predictive performance of 33 species distribution models at species and community levels

AU - Norberg, Anna

AU - Abrego Antia, Nerea

AU - Blanchet, F. Guillaume

AU - Adler, Frederick R.

AU - Anderson, Barbara J.

AU - Anttila, Jani

AU - Araújo, Miguel B.

AU - Dallas, Tad Anthony

AU - Dunson, David

AU - Elith, Jane

AU - Foster, Scott

AU - Fox, Richard

AU - Franklin, Janet

AU - Godsoe, William

AU - Guisan, Antoine

AU - O'Hara, Bob

AU - Hill, Nicole A.

AU - Holt, Robert D.

AU - Hui, Francis K.C

AU - Husby, Magne

AU - Kålås, John Atle

AU - Lehikoinen, Aleksi

AU - Luoto, Miska

AU - Mod, Heidi K.

AU - Newell, Graeme

AU - Renner, Ian

AU - Roslin, Tomas Valter

AU - Soininen, Janne

AU - Thuiller, Wilfried

AU - Vanhatalo, Jarno Petteri

AU - Warton, David

AU - White, Matt

AU - Zimmermann, Niklaus E.

AU - Gravel, Dominique

AU - Ovaskainen, Otso Tapio

PY - 2019/8

Y1 - 2019/8

AB - A large array of species distribution model (SDM) approaches has been developed for explaining and predicting the occurrences of individual species or species assemblages. Given the wealth of existing models, it is unclear which models perform best for interpolation or extrapolation of existing data sets, particularly when one is concerned with species assemblages. We compared the predictive performance of 33 variants of 15 widely applied and recently emerged SDMs in the context of multispecies data, including both joint SDMs that model multiple species together, and stacked SDMs that model each species individually combining the predictions afterward. We offer a comprehensive evaluation of these SDM approaches by examining their performance in predicting withheld empirical validation data of different sizes representing five different taxonomic groups, and for prediction tasks related to both interpolation and extrapolation. We measure predictive performance by 12 measures of accuracy, discrimination power, calibration, and precision of predictions, for the biological levels of species occurrence, species richness, and community composition. Our results show large variation among the models in their predictive performance, especially for communities comprising many species that are rare. The results do not reveal any major trade-offs among measures of model performance; the same models performed generally well in terms of accuracy, discrimination, and calibration, and for the biological levels of individual species, species richness, and community composition. In contrast, the models that gave the most precise predictions were not well calibrated, suggesting that poorly performing models can make overconfident predictions. However, none of the models performed well for all prediction tasks. 
As a general strategy, we therefore propose that researchers fit a small set of models showing complementary performance, and then apply a cross-validation procedure involving separate data to establish which of these models performs best for the goal of the study.

KW - BIOTIC INTERACTIONS

KW - CLIMATE

KW - GENERALIZED ADDITIVE-MODELS

KW - IMPROVE PREDICTION

KW - INCORPORATING SPATIAL AUTOCORRELATION

KW - NEURAL-NETWORKS

KW - NICHE

KW - RANGE SHIFTS

KW - SIMULATED DATA

KW - STATISTICAL-MODELS

KW - community assembly

KW - community modeling

KW - environmental filtering

KW - joint species distribution model

KW - model performance

KW - prediction

KW - predictive power

KW - species interactions

KW - stacked species distribution model

KW - 1172 Environmental sciences

U2 - 10.1002/ecm.1370

DO - 10.1002/ecm.1370

M3 - Article

VL - 89

JO - Ecological Monographs

JF - Ecological Monographs

SN - 0012-9615

IS - 3

M1 - 01370

ER -