Greedy Shortest Common Superstring Approximation in Compact Space

Forskningsoutput: Kapitel i bok/rapport/konferenshandlingKonferensbidragVetenskapligPeer review


Given a set of strings, the shortest common superstring problem is to find the shortest possible string that contains all the input strings. The problem is NP-hard, but a lot of work has gone into designing approximation algorithms for solving the problem. We present the first time and space efficient implementation of the classic greedy heuristic which merges strings in decreasing order of overlap length. Our implementation works in O(n log σ) time and bits of space, where n is the total length of the input strings in characters, and σσ is the size of the alphabet. After index construction, a practical implementation of our algorithm uses roughly 5n log σ bits of space and reasonable time for a real dataset that consists of DNA fragments.
Titel på gästpublikationString Processing and Information Retrieval : 24th International Symposium, SPIRE 2017, Palermo, Italy, September 26-29, 2017, Proceedings
RedaktörerGabriele Fici, Marinella Sciortino, Rosssano Venturini
Antal sidor13
FörlagSpringer International Publishing AG
Utgivningsdatum6 sep 2017
ISBN (tryckt)978-3-319-67427-8
ISBN (elektroniskt)978-3-319-67428-5
StatusPublicerad - 6 sep 2017
MoE-publikationstypA4 Artikel i en konferenspublikation
EvenemangInternational Symposium on String Processing and Information Retrieval - Palermo, Italien
Varaktighet: 26 sep 201729 sep 2017
Konferensnummer: 24


NamnLecture Notes in Computer Science
FörlagSpringer International Publishing AG
ISSN (tryckt)0302-9743
ISSN (elektroniskt)1161-3349


  • 113 Data- och informationsvetenskap

Citera det här