Computational framework for targeted high-coverage sequencing based NIPT

Hindrek Teder, Priit Paluoja, Andres Salumets, Kaarel Krjutškov, Priit Palta

Research output: Contribution to journal › Article › Scientific

Abstract

Non-invasive prenatal testing (NIPT) enables accurate detection of fetal chromosomal trisomies. The majority of existing computational methods for sequencing-based NIPT analyses rely on low-coverage whole-genome sequencing (WGS) data and are not applicable to targeted high-coverage sequencing data from cell-free DNA samples. Here, we present a novel computational framework for targeted high-coverage sequencing-based NIPT analysis. The developed methods use a hidden Markov model (HMM)-based approach in conjunction with supplemental machine learning methods, such as decision tree (DT) and support vector machine (SVM), to detect fetal trisomy and the parental origin of additional fetal chromosomes. These methods were tested with simulated datasets covering a wide range of biologically relevant scenarios with various chromosomal quantities, parental origins of extra chromosomes, fetal DNA fractions and sequencing read depths. Consequently, we determined the functional feasibility and limitations of each proposed approach and demonstrated that the read count-based HMM achieved the best overall classification accuracy of 0.89 for detecting fetal euploidies and trisomies. Furthermore, we show that applying the DT and SVM methods to the HMM state classification results increased the final trisomy classification accuracy to 0.98 and 0.99, respectively. We demonstrated that read count and allelic ratio-based models can achieve a high accuracy (up to 0.98) for detecting fetal trisomy even if the fetal fraction is as low as 2%. Existing methods require a fetal fraction of at least 4%, which can be an issue in the case of early gestational age (<10 weeks) or elevated maternal body mass index (>35 kg/m²). More accurate detection can be achieved at higher sequencing depth using HMM in conjunction with supplemental methods, which significantly improve trisomy detection, especially in borderline scenarios (e.g., very low fetal fraction), and could enable NIPT to be performed even earlier than 10 weeks of pregnancy.
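
To illustrate why a low fetal fraction is a borderline scenario for read count-based detection, the following is a minimal arithmetic sketch and not the authors' method. It assumes a simple mixture model in which the mother is euploid (two copies), the fetus carries three copies of the affected chromosome, and reads are sampled in proportion to copy number. The function name `expected_read_ratio` and its parameters are hypothetical and chosen for illustration only.

```python
def expected_read_ratio(fetal_fraction: float, fetal_copies: int = 3) -> float:
    """Expected read depth on a chromosome relative to a euploid reference.

    Assumption (not from the source): maternal cfDNA contributes 2 copies
    with weight (1 - f) and fetal cfDNA contributes `fetal_copies` copies
    with weight f, where f is the fetal fraction.
    """
    if not 0.0 <= fetal_fraction <= 1.0:
        raise ValueError("fetal_fraction must be between 0 and 1")
    maternal = 2 * (1.0 - fetal_fraction)
    fetal = fetal_copies * fetal_fraction
    # Normalise by the 2 copies expected for a euploid chromosome.
    return (maternal + fetal) / 2.0


if __name__ == "__main__":
    for f in (0.02, 0.04, 0.10):
        excess = expected_read_ratio(f) - 1.0
        print(f"fetal fraction {f:.0%}: expected read excess {excess:.1%}")
```

Under these assumptions the expected excess is f/2, so a 2% fetal fraction shifts the read depth of a trisomic chromosome by only about 1%, which is consistent with the abstract's emphasis on higher sequencing depth for such cases.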
Original language: English
Journal: bioRxiv : the preprint server for biology
DOI (permanent link): 10.1101/486282
Status: Submitted - 1 January 2018
Ministry of Education and Culture (OKM) publication type: B1 Article in a scientific journal

Cite this

@article{e6aa5db1898b47469ca250041f687833,
title = "Computational framework for targeted high-coverage sequencing based NIPT",
author = "Hindrek Teder and Priit Paluoja and Andres Salumets and Kaarel Krjutškov and Priit Palta",
year = "2018",
month = "1",
day = "1",
doi = "10.1101/486282",
language = "English",
journal = "bioRxiv : the preprint server for biology",
publisher = "Cold Spring Harbor Laboratory",

}

Computational framework for targeted high-coverage sequencing based NIPT. / Teder, Hindrek; Paluoja, Priit; Salumets, Andres; Krjutškov, Kaarel; Palta, Priit.

In: bioRxiv : the preprint server for biology, 01.01.2018.

TY - JOUR

T1 - Computational framework for targeted high-coverage sequencing based NIPT

AU - Teder, Hindrek

AU - Paluoja, Priit

AU - Salumets, Andres

AU - Krjutškov, Kaarel

AU - Palta, Priit

PY - 2018/1/1

Y1 - 2018/1/1

U2 - 10.1101/486282

DO - 10.1101/486282

M3 - Article

JO - bioRxiv : the preprint server for biology

JF - bioRxiv : the preprint server for biology

ER -