Bayesian identification of bacterial strains from sequencing data

Aravind Sankar, Brandon Michael Malone, Sion C. Bayliss, Ben Pascoe, Guillaume Méric, Matthew D. Hitchings, Samuel K. Sheppard, Edward J. Feil, Jukka Ilmari Corander, Antti Juho Henrikki Honkela

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Rapidly assaying the diversity of a bacterial species present in a sample obtained from a hospital patient or an environmental source has become possible after recent technological advances in DNA sequencing. For several applications it is important to accurately identify the presence and estimate relative abundances of the target organisms from short sequence reads obtained from a sample. This task is particularly challenging when the set of interest includes very closely related organisms, such as different strains of pathogenic bacteria, which can vary considerably in terms of virulence, resistance and spread. Using advanced Bayesian statistical modelling and computation techniques we introduce a novel pipeline for bacterial identification that is shown to outperform the currently leading pipeline for this purpose. Our approach enables fast and accurate sequence-based identification of bacterial strains while using only modest computational resources. Hence it provides a useful tool for a wide spectrum of applications, including rapid clinical diagnostics to distinguish among closely related strains causing nosocomial infections. The software implementation is available at https://github.com/PROBIC/BIB.
Original language: English
Journal: Microbial Genomics
Volume: 2
Issue number: 8
Number of pages: 9
ISSN: 2057-5858
DOI: https://doi.org/10.1099/mgen.0.000075
Publication status: Published - 25 Aug 2016
MoE publication type: A1 Journal article-refereed

Fields of Science

  • 113 Computer and information sciences
  • 1183 Plant biology, microbiology, virology

Cite this

Sankar, A., Malone, B. M., Bayliss, S. C., Pascoe, B., Méric, G., Hitchings, M. D., ... Honkela, A. J. H. (2016). Bayesian identification of bacterial strains from sequencing data. Microbial Genomics, 2(8). https://doi.org/10.1099/mgen.0.000075
Sankar, Aravind ; Malone, Brandon Michael ; Bayliss, Sion C. ; Pascoe, Ben ; Méric, Guillaume ; Hitchings, Matthew D. ; Sheppard, Samuel K. ; Feil, Edward J. ; Corander, Jukka Ilmari ; Honkela, Antti Juho Henrikki. / Bayesian identification of bacterial strains from sequencing data. In: Microbial Genomics. 2016 ; Vol. 2, No. 8.
@article{654ebc0b2fbb4b138cde18297386bd82,
title = "Bayesian identification of bacterial strains from sequencing data",
abstract = "Rapidly assaying the diversity of a bacterial species present in a sample obtained from a hospital patient or an environmental source has become possible after recent technological advances in DNA sequencing. For several applications it is important to accurately identify the presence and estimate relative abundances of the target organisms from short sequence reads obtained from a sample. This task is particularly challenging when the set of interest includes very closely related organisms, such as different strains of pathogenic bacteria, which can vary considerably in terms of virulence, resistance and spread. Using advanced Bayesian statistical modelling and computation techniques we introduce a novel pipeline for bacterial identification that is shown to outperform the currently leading pipeline for this purpose. Our approach enables fast and accurate sequence-based identification of bacterial strains while using only modest computational resources. Hence it provides a useful tool for a wide spectrum of applications, including rapid clinical diagnostics to distinguish among closely related strains causing nosocomial infections. The software implementation is available at https://github.com/PROBIC/BIB.",
keywords = "113 Computer and information sciences, 1183 Plant biology, microbiology, virology",
author = "Aravind Sankar and Malone, {Brandon Michael} and Bayliss, {Sion C.} and Ben Pascoe and Guillaume M{\'e}ric and Hitchings, {Matthew D.} and Sheppard, {Samuel K.} and Feil, {Edward J.} and Corander, {Jukka Ilmari} and Honkela, {Antti Juho Henrikki}",
year = "2016",
month = "8",
day = "25",
doi = "10.1099/mgen.0.000075",
language = "English",
volume = "2",
journal = "Microbial Genomics",
issn = "2057-5858",
publisher = "Microbiology Society",
number = "8",
}

Sankar, A, Malone, BM, Bayliss, SC, Pascoe, B, Méric, G, Hitchings, MD, Sheppard, SK, Feil, EJ, Corander, JI & Honkela, AJH 2016, 'Bayesian identification of bacterial strains from sequencing data', Microbial Genomics, vol. 2, no. 8. https://doi.org/10.1099/mgen.0.000075

Bayesian identification of bacterial strains from sequencing data. / Sankar, Aravind; Malone, Brandon Michael; Bayliss, Sion C.; Pascoe, Ben; Méric, Guillaume; Hitchings, Matthew D.; Sheppard, Samuel K.; Feil, Edward J.; Corander, Jukka Ilmari; Honkela, Antti Juho Henrikki.

In: Microbial Genomics, Vol. 2, No. 8, 25.08.2016.

TY - JOUR

T1 - Bayesian identification of bacterial strains from sequencing data

AU - Sankar, Aravind

AU - Malone, Brandon Michael

AU - Bayliss, Sion C.

AU - Pascoe, Ben

AU - Méric, Guillaume

AU - Hitchings, Matthew D.

AU - Sheppard, Samuel K.

AU - Feil, Edward J.

AU - Corander, Jukka Ilmari

AU - Honkela, Antti Juho Henrikki

PY - 2016/8/25

Y1 - 2016/8/25

N2 - Rapidly assaying the diversity of a bacterial species present in a sample obtained from a hospital patient or an environmental source has become possible after recent technological advances in DNA sequencing. For several applications it is important to accurately identify the presence and estimate relative abundances of the target organisms from short sequence reads obtained from a sample. This task is particularly challenging when the set of interest includes very closely related organisms, such as different strains of pathogenic bacteria, which can vary considerably in terms of virulence, resistance and spread. Using advanced Bayesian statistical modelling and computation techniques we introduce a novel pipeline for bacterial identification that is shown to outperform the currently leading pipeline for this purpose. Our approach enables fast and accurate sequence-based identification of bacterial strains while using only modest computational resources. Hence it provides a useful tool for a wide spectrum of applications, including rapid clinical diagnostics to distinguish among closely related strains causing nosocomial infections. The software implementation is available at https://github.com/PROBIC/BIB.

KW - 113 Computer and information sciences

KW - 1183 Plant biology, microbiology, virology

U2 - 10.1099/mgen.0.000075

DO - 10.1099/mgen.0.000075

M3 - Article

VL - 2

JO - Microbial Genomics

JF - Microbial Genomics

SN - 2057-5858

IS - 8

ER -

Sankar A, Malone BM, Bayliss SC, Pascoe B, Méric G, Hitchings MD et al. Bayesian identification of bacterial strains from sequencing data. Microbial Genomics. 2016 Aug 25;2(8). https://doi.org/10.1099/mgen.0.000075