Efficient construction of maximal and minimal representations of motifs of a string

Francois Nicolas, Veli Mäkinen, Esko Ukkonen

Research output: Contribution to journalArticleScientificpeer-review

Abstract

"Two substrings of a given text string are called synchronous (occurrence-equivalent) if their sets of occurrence locations are translates of each other. Linear time algorithms are given for the problems of finding a shortest and a longest substring that is synchronous with a given substring. We also introduce approximate variants of the motif discovery problem and give polynomial time algorithms for finding longest and shortest substrings whose suitably translated occurrence location set contains or, respectively, is contained in a given set of locations. The FFT technique used here also leads to an O(n log n) algorithm for finding the maximum-content gapped motif that is synchronous with a given set of locations; the previously known algorithm for this problem is only quadratic. (C) 2009 Elsevier B.V. All rights reserved."
Original languageEnglish
JournalTheoretical Computer Science
Volume410 (2009)
Pages (from-to)2999-3005
Number of pages7
ISSN0304-3975
DOIs
Publication statusPublished - 2009
MoE publication typeA1 Journal article-refereed

Fields of Science

  • 113 Computer and information sciences

Cite this

@article{bf24e735c9eb4bdb9c8deb8072203501,
title = "Efficient construction of maximal and minimal representations of motifs of a string",
abstract = "{"}Two substrings of a given text string are called synchronous (occurrence-equivalent) if their sets of occurrence locations are translates of each other. Linear time algorithms are given for the problems of finding a shortest and a longest substring that is synchronous with a given substring. We also introduce approximate variants of the motif discovery problem and give polynomial time algorithms for finding longest and shortest substrings whose suitably translated occurrence location set contains or, respectively, is contained in a given set of locations. The FFT technique used here also leads to an O(n log n) algorithm for finding the maximum-content gapped motif that is synchronous with a given set of locations; the previously known algorithm for this problem is only quadratic. (C) 2009 Elsevier B.V. All rights reserved.{"}",
keywords = "113 Computer and information sciences",
author = "Francois Nicolas and Veli M{\"a}kinen and Esko Ukkonen",
year = "2009",
doi = "10.1016/j.tcs.2009.03.013",
language = "English",
volume = "410 (2009)",
pages = "2999--3005",
journal = "Theoretical Computer Science",
issn = "0304-3975",
publisher = "Elsevier Scientific Publ. Co",

}

Efficient construction of maximal and minimal representations of motifs of a string. / Nicolas, Francois; Mäkinen, Veli; Ukkonen, Esko.

In: Theoretical Computer Science, Vol. 410 (2009), 2009, p. 2999-3005.

Research output: Contribution to journalArticleScientificpeer-review

TY - JOUR

T1 - Efficient construction of maximal and minimal representations of motifs of a string

AU - Nicolas, Francois

AU - Mäkinen, Veli

AU - Ukkonen, Esko

PY - 2009

Y1 - 2009

N2 - "Two substrings of a given text string are called synchronous (occurrence-equivalent) if their sets of occurrence locations are translates of each other. Linear time algorithms are given for the problems of finding a shortest and a longest substring that is synchronous with a given substring. We also introduce approximate variants of the motif discovery problem and give polynomial time algorithms for finding longest and shortest substrings whose suitably translated occurrence location set contains or, respectively, is contained in a given set of locations. The FFT technique used here also leads to an O(n log n) algorithm for finding the maximum-content gapped motif that is synchronous with a given set of locations; the previously known algorithm for this problem is only quadratic. (C) 2009 Elsevier B.V. All rights reserved."

AB - "Two substrings of a given text string are called synchronous (occurrence-equivalent) if their sets of occurrence locations are translates of each other. Linear time algorithms are given for the problems of finding a shortest and a longest substring that is synchronous with a given substring. We also introduce approximate variants of the motif discovery problem and give polynomial time algorithms for finding longest and shortest substrings whose suitably translated occurrence location set contains or, respectively, is contained in a given set of locations. The FFT technique used here also leads to an O(n log n) algorithm for finding the maximum-content gapped motif that is synchronous with a given set of locations; the previously known algorithm for this problem is only quadratic. (C) 2009 Elsevier B.V. All rights reserved."

KW - 113 Computer and information sciences

U2 - 10.1016/j.tcs.2009.03.013

DO - 10.1016/j.tcs.2009.03.013

M3 - Article

VL - 410 (2009)

SP - 2999

EP - 3005

JO - Theoretical Computer Science

JF - Theoretical Computer Science

SN - 0304-3975

ER -