Combinatorial approaches for mass spectra recalibration

Sebastian Böcker, Veli Mäkinen

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Mass spectrometry has become one of the most popular analysis techniques in Proteomics and Systems Biology. With the creation of larger data sets, the automated recalibration of mass spectra becomes important to ensure that every peak in the sample spectrum is correctly assigned to some peptide and protein. Algorithms for recalibrating mass spectra have to be robust with respect to wrongly assigned peaks, as well as efficient due to the amount of mass spectrometry data. The recalibration of mass spectra leads us to the problem of finding an optimal matching between mass spectra under measurement errors. We have developed two deterministic methods that allow robust computation of such a matching: The first approach uses a computational geometry interpretation of the problem and tries to find two parallel lines with constant distance that stab a maximal number of points in the plane. The second approach is based on finding a maximal common approximate subsequence and improves existing algorithms by one order of magnitude exploiting the sequential nature of the matching problem. We compare our results to a computational geometry algorithm using a topological line sweep.
Original language: English
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 5
Issue number: 1
Pages (from-to): 91-100
Number of pages: 10
ISSN: 1545-5963
DOIs: 10.1109/TCBB.2007.1077
Publication status: Published - 2008
MoE publication type: A1 Journal article-refereed

Fields of Science

  • 113 Computer and information sciences

Cite this

@article{4c71a3d30fa442edaaa329049cb48976,
title = "Combinatorial approaches for mass spectra recalibration",
abstract = "Mass spectrometry has become one of the most popular analysis techniques in Proteomics and Systems Biology. With the creation of larger data sets, the automated recalibration of mass spectra becomes important to ensure that every peak in the sample spectrum is correctly assigned to some peptide and protein. Algorithms for recalibrating mass spectra have to be robust with respect to wrongly assigned peaks, as well as efficient due to the amount of mass spectrometry data. The recalibration of mass spectra leads us to the problem of finding an optimal matching between mass spectra under measurement errors. We have developed two deterministic methods that allow robust computation of such a matching: The first approach uses a computational geometry interpretation of the problem and tries to find two parallel lines with constant distance that stab a maximal number of points in the plane. The second approach is based on finding a maximal common approximate subsequence and improves existing algorithms by one order of magnitude exploiting the sequential nature of the matching problem. We compare our results to a computational geometry algorithm using a topological line sweep.",
keywords = "113 Computer and information sciences",
author = "Sebastian B{\"o}cker and Veli M{\"a}kinen",
year = "2008",
doi = "10.1109/TCBB.2007.1077",
language = "English",
volume = "5",
pages = "91--100",
journal = "IEEE/ACM Transactions on Computational Biology and Bioinformatics",
issn = "1545-5963",
publisher = "IEEE",
number = "1",
}

Combinatorial approaches for mass spectra recalibration. / Böcker, Sebastian; Mäkinen, Veli.

In: IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 5, No. 1, 2008, p. 91-100.


TY - JOUR
T1 - Combinatorial approaches for mass spectra recalibration
AU - Böcker, Sebastian
AU - Mäkinen, Veli
PY - 2008
Y1 - 2008
AB - Mass spectrometry has become one of the most popular analysis techniques in Proteomics and Systems Biology. With the creation of larger data sets, the automated recalibration of mass spectra becomes important to ensure that every peak in the sample spectrum is correctly assigned to some peptide and protein. Algorithms for recalibrating mass spectra have to be robust with respect to wrongly assigned peaks, as well as efficient due to the amount of mass spectrometry data. The recalibration of mass spectra leads us to the problem of finding an optimal matching between mass spectra under measurement errors. We have developed two deterministic methods that allow robust computation of such a matching: The first approach uses a computational geometry interpretation of the problem and tries to find two parallel lines with constant distance that stab a maximal number of points in the plane. The second approach is based on finding a maximal common approximate subsequence and improves existing algorithms by one order of magnitude exploiting the sequential nature of the matching problem. We compare our results to a computational geometry algorithm using a topological line sweep.
KW - 113 Computer and information sciences
U2 - 10.1109/TCBB.2007.1077
DO - 10.1109/TCBB.2007.1077
M3 - Article
VL - 5
SP - 91
EP - 100
JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics
JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics
SN - 1545-5963
IS - 1
ER -