Faster entropy-bounded compressed suffix trees

Johannes Fischer, Veli Mäkinen, Gonzalo Navarro

Tutkimustuotos: ArtikkelijulkaisuArtikkeliTieteellinenvertaisarvioitu

Kuvaus

Suffix trees are among the most important data structures in stringology, with a number of applications in flourishing areas like bioinformatics. Their main problem is space usage, which has triggered much research striving for compressed representations that are still functional. A smaller suffix tree representation could fit in a faster memory, outweighing by far the theoretical slowdown brought by the space reduction. We present a novel compressed suffix tree, which is the first achieving at the same time sublogarithmic complexity for the operations, and space usage that asymptotically goes to zero as the entropy of the text does. The main ideas in our development are compressing the longest common prefix information, totally getting rid of the suffix tree topology, and expressing all the suffix tree operations using range minimum queries and a novel primitive called next/previous smaller value in a sequence. Our solutions to those operations are of independent interest. (C) 2009 Elsevier B.V. All rights reserved.
Alkuperäiskielienglanti
LehtiTheoretical Computer Science
Vuosikerta410 (2009)
Sivut5354-5364
Sivumäärä11
ISSN0304-3975
DOI - pysyväislinkit
TilaJulkaistu - 2009
OKM-julkaisutyyppiA1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä, vertaisarvioitu

Tieteenalat

  • 113 Tietojenkäsittely- ja informaatiotieteet

Lainaa tätä

Fischer, Johannes ; Mäkinen, Veli ; Navarro, Gonzalo. / Faster entropy-bounded compressed suffix trees. Julkaisussa: Theoretical Computer Science. 2009 ; Vuosikerta 410 (2009). Sivut 5354-5364.
@article{f519741afb0b4feeb4799927482a63ad,
title = "Faster entropy-bounded compressed suffix trees",
abstract = "Suffix trees are among the most important data structures in stringology, with a number of applications in flourishing areas like bioinformatics. Their main problem is space usage, which has triggered much research striving for compressed representations that are still functional. A smaller suffix tree representation could fit in a faster memory, outweighing by far the theoretical slowdown brought by the space reduction. We present a novel compressed suffix tree, which is the first achieving at the same time sublogarithmic complexity for the operations, and space usage that asymptotically goes to zero as the entropy of the text does. The main ideas in our development are compressing the longest common prefix information, totally getting rid of the suffix tree topology, and expressing all the suffix tree operations using range minimum queries and a novel primitive called next/previous smaller value in a sequence. Our solutions to those operations are of independent interest. (C) 2009 Elsevier B.V. All rights reserved.",
keywords = "113 Computer and information sciences",
author = "Johannes Fischer and Veli M{\"a}kinen and Gonzalo Navarro",
year = "2009",
doi = "10.1016/j.tcs.2009.09.012",
language = "English",
volume = "410 (2009)",
pages = "5354--5364",
journal = "Theoretical Computer Science",
issn = "0304-3975",
publisher = "Elsevier Scientific Publ. Co",

}

Faster entropy-bounded compressed suffix trees. / Fischer, Johannes; Mäkinen, Veli; Navarro, Gonzalo.

julkaisussa: Theoretical Computer Science, Vuosikerta 410 (2009), 2009, s. 5354-5364.

Tutkimustuotos: ArtikkelijulkaisuArtikkeliTieteellinenvertaisarvioitu

TY - JOUR

T1 - Faster entropy-bounded compressed suffix trees

AU - Fischer, Johannes

AU - Mäkinen, Veli

AU - Navarro, Gonzalo

PY - 2009

Y1 - 2009

N2 - Suffix trees are among the most important data structures in stringology, with a number of applications in flourishing areas like bioinformatics. Their main problem is space usage, which has triggered much research striving for compressed representations that are still functional. A smaller suffix tree representation could fit in a faster memory, outweighing by far the theoretical slowdown brought by the space reduction. We present a novel compressed suffix tree, which is the first achieving at the same time sublogarithmic complexity for the operations, and space usage that asymptotically goes to zero as the entropy of the text does. The main ideas in our development are compressing the longest common prefix information, totally getting rid of the suffix tree topology, and expressing all the suffix tree operations using range minimum queries and a novel primitive called next/previous smaller value in a sequence. Our solutions to those operations are of independent interest. (C) 2009 Elsevier B.V. All rights reserved.

AB - Suffix trees are among the most important data structures in stringology, with a number of applications in flourishing areas like bioinformatics. Their main problem is space usage, which has triggered much research striving for compressed representations that are still functional. A smaller suffix tree representation could fit in a faster memory, outweighing by far the theoretical slowdown brought by the space reduction. We present a novel compressed suffix tree, which is the first achieving at the same time sublogarithmic complexity for the operations, and space usage that asymptotically goes to zero as the entropy of the text does. The main ideas in our development are compressing the longest common prefix information, totally getting rid of the suffix tree topology, and expressing all the suffix tree operations using range minimum queries and a novel primitive called next/previous smaller value in a sequence. Our solutions to those operations are of independent interest. (C) 2009 Elsevier B.V. All rights reserved.

KW - 113 Computer and information sciences

U2 - 10.1016/j.tcs.2009.09.012

DO - 10.1016/j.tcs.2009.09.012

M3 - Article

VL - 410 (2009)

SP - 5354

EP - 5364

JO - Theoretical Computer Science

JF - Theoretical Computer Science

SN - 0304-3975

ER -