Evaluating complex relationships between ecological indicators and environmental factors in the Baltic Sea

A machine learning approach

Annukka Maaria Lehikoinen, Jens Olsson, Lena Bergström, Ulf Bergstrom, Andreas Bryhn, Ronny Fredriksson, Laura Uusitalo

Research output: Contribution to journal, Article, Scientific, peer-reviewed

Abstract

The state of marine ecosystems is increasingly evaluated using indicators. The indicator assessment results need to be interpreted in the context of the whole ecosystem in order to understand the key factors determining the status of these environmental components. Data available from the system’s different components are, however, often heterogeneous: they may represent different spatial and temporal scales, and different parameters can be measured with different accuracy. This makes it difficult to evaluate the relationship between these variables and the status of the environment using indicators. We studied whether probabilistic, machine learning-based classifiers could provide a means of assessing the relationships between multiple environmental factors and ecological indicators. This paper demonstrates the use of Bayesian network classifiers (with the Tree-augmented Naive Bayes classifier, TAN, as the specific case example), used together with structural learning from data and the Entropy Minimization Discretization (IEMD) algorithm, to study environment-indicator relationships within coastal fish communities in the Baltic Sea. By using two Baltic-wide indicators of coastal fish community status and a heterogeneous set of potentially influential natural and anthropogenic variables, we explore and discuss the potential of the approach. Given pre-defined cutting points for the indicators, such as the classification thresholds of the indicator, the method enables identifying relevant variables and estimating their relative importance. This information could be used in environmental management to demonstrate at which threshold value the state of an indicator is likely to respond to a pressure or a combination of pressures. In contrast to many other multivariate statistical methodologies, the presented approach can handle missing data as well as data of varying types, from fully quantitative to presence-absence, in the same analysis.
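
To illustrate the discretization idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the core step of an entropy-minimization discretization is to pick, for one continuous variable, the cut point that minimizes the weighted class entropy of the indicator status on either side. The function names, the example values, and the choice to skip missing observations (`None`) are hypothetical; the recursive splitting and stopping criterion that full algorithms of this kind use are omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def best_cut_point(values, labels):
    """Find the single binary cut of `values` that minimizes the weighted
    class entropy of `labels`.

    Returns (cut, weighted_entropy); cut is None when no split is possible.
    Assumption of this sketch: observations whose value is None are skipped.
    """
    pairs = sorted(
        ((v, c) for v, c in zip(values, labels) if v is not None),
        key=lambda p: p[0],
    )
    n = len(pairs)
    best_cut, best_score = None, float("inf")
    for i in range(1, n):
        left_v, right_v = pairs[i - 1][0], pairs[i][0]
        if left_v == right_v:
            continue  # a cut cannot separate identical values
        left = [c for _, c in pairs[:i]]
        right = [c for _, c in pairs[i:]]
        score = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        if score < best_score:
            # Place the cut midway between the two neighbouring values.
            best_cut, best_score = (left_v + right_v) / 2, score
    return best_cut, best_score


# Hypothetical data: one environmental variable and an indicator status class.
pressure = [2.0, 3.5, None, 4.0, 7.5, 8.0, 9.0]
status = ["good", "good", "good", "good", "poor", "poor", "poor"]
cut, score = best_cut_point(pressure, status)
print("cut point:", cut)
```

A cut found this way turns a continuous variable into discrete states that a classifier such as TAN can use. In a fuller treatment the split would be applied recursively, with a criterion deciding when further cuts stop being informative.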
Original language: English
Journal: Ecological Indicators
Volume: 101
Pages: 117-125
Number of pages: 9
ISSN: 1470-160X
DOI: 10.1016/j.ecolind.2018.12.053
Status: Published - June 2019
Ministry of Education and Culture (OKM) publication type: A1 Original article in a scientific journal, peer-reviewed

Fields of science

  • 1172 Environmental sciences

Cite this

Lehikoinen, Annukka Maaria ; Olsson, Jens ; Bergström, Lena ; Bergstrom, Ulf ; Bryhn, Andreas ; Fredriksson, Ronny ; Uusitalo, Laura. / Evaluating complex relationships between ecological indicators and environmental factors in the Baltic Sea : A machine learning approach. In: Ecological Indicators. 2019 ; Vol. 101. pp. 117-125.
@article{38bb046571244bcf877a34177cb80a6b,
title = "Evaluating complex relationships between ecological indicators and environmental factors in the Baltic Sea: A machine learning approach",
keywords = "1172 Environmental sciences, Bayesian network classifiers, Tree-augmented Naive Bayes, Entropy Minimization Discretization, Coastal fish communities, Baltic Sea, COASTAL FISH INDICATORS, NAIVE BAYES, FRAMEWORK, EUTROPHICATION",
author = "Lehikoinen, {Annukka Maaria} and Jens Olsson and Lena Bergstr{\"o}m and Ulf Bergstrom and Andreas Bryhn and Ronny Fredriksson and Laura Uusitalo",
year = "2019",
month = "6",
doi = "10.1016/j.ecolind.2018.12.053",
language = "English",
volume = "101",
pages = "117--125",
journal = "Ecological Indicators",
issn = "1470-160X",
publisher = "Elsevier Scientific Publ. Co",

}

Evaluating complex relationships between ecological indicators and environmental factors in the Baltic Sea : A machine learning approach. / Lehikoinen, Annukka Maaria; Olsson, Jens; Bergström, Lena; Bergstrom, Ulf; Bryhn, Andreas; Fredriksson, Ronny; Uusitalo, Laura.

In: Ecological Indicators, Vol. 101, 06.2019, pp. 117-125.

TY - JOUR

T1 - Evaluating complex relationships between ecological indicators and environmental factors in the Baltic Sea

T2 - A machine learning approach

AU - Lehikoinen, Annukka Maaria

AU - Olsson, Jens

AU - Bergström, Lena

AU - Bergstrom, Ulf

AU - Bryhn, Andreas

AU - Fredriksson, Ronny

AU - Uusitalo, Laura

PY - 2019/6

Y1 - 2019/6

KW - 1172 Environmental sciences

KW - Bayesian network classifiers

KW - Tree-augmented Naive Bayes

KW - Entropy Minimization Discretization

KW - Coastal fish communities

KW - Baltic Sea

KW - COASTAL FISH INDICATORS

KW - NAIVE BAYES

KW - FRAMEWORK

KW - EUTROPHICATION

U2 - 10.1016/j.ecolind.2018.12.053

DO - 10.1016/j.ecolind.2018.12.053

M3 - Article

VL - 101

SP - 117

EP - 125

JO - Ecological Indicators

JF - Ecological Indicators

SN - 1470-160X

ER -