Abstract
Background: SARS-CoV-2 is the highly transmissible etiologic agent of coronavirus disease 2019 (COVID-19) and has been a global scientific and public health challenge since December 2019. Several new variants of SARS-CoV-2 have emerged globally, raising concern about the prevention and treatment of COVID-19. Early detection and in-depth analysis of emerging variants, which allow pre-emptive alerts and mitigation efforts, are thus of paramount importance.
Results: Here we present ClusTRace, a novel bioinformatic pipeline for fast and scalable analysis of sequence clusters or clades in large viral phylogenies. ClusTRace offers several high-level functionalities, including lineage assignment, outlier filtering, alignment, phylogenetic tree reconstruction, cluster extraction, variant calling, visualization and reporting. ClusTRace was developed as an aid for COVID-19 transmission chain tracing in Finland, with the main emphasis on fast screening of phylogenies for markers of super-spreading events and other features of concern, such as high rates of cluster growth and/or accumulation of novel mutations.
Conclusions: ClusTRace provides an effective interface that can significantly cut down learning and operating costs related to complex bioinformatic analysis of large viral sequence sets and phylogenies. All code is freely available from https://bitbucket.org/plyusnin/clustrace/
Original language | English |
---|---|
Article number | 196 |
Journal | BMC Bioinformatics |
Volume | 23 |
Issue number | 1 |
Number of pages | 16 |
ISSN | 1471-2105 |
Publication status | Published - 28 May 2022 |
MoE publication type | A1 Journal article-refereed |
Fields of Science
- Cluster analysis
- GENOME
- Phylogenetic analysis
- SARS-CoV-2
- TRENDS
- Variant calling
- Virus
- 3111 Biomedicine
- 1182 Biochemistry, cell and molecular biology
- 11832 Microbiology and virology