
Fast Indexes for Gapped Pattern Matching

Manuel Cáceres, Simon Puglisi, Bella Zhukova

Research output: Chapter in book/report/conference proceedings, Conference contribution, Scientific, Peer-reviewed

Abstract

We describe indexes for searching large data sets for variable-length-gapped (VLG) patterns. VLG patterns are composed of two or more subpatterns, between each adjacent pair of which is a gap constraint specifying upper and lower bounds on the distance allowed between subpatterns. VLG patterns have numerous applications in computational biology (motif search) and information retrieval (e.g., for language models, snippet generation, machine translation), and capture a useful subclass of the regular expressions commonly used in practice for searching source code. Our best approach provides search speeds several times faster than prior art across a broad range of patterns and texts.
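To make the definition of a VLG pattern concrete, the following is a minimal sketch of a naive, index-free matcher. It is not the indexing approach of the paper. The function names `find_all` and `vlg_match_starts` are hypothetical, and the sketch assumes that a gap is measured as the number of characters between the end of one subpattern and the start of the next.

```python
from bisect import bisect_left


def find_all(text, sub):
    """Return the sorted start positions of every occurrence of sub (overlaps included)."""
    positions = []
    i = text.find(sub)
    while i != -1:
        positions.append(i)
        i = text.find(sub, i + 1)
    return positions


def vlg_match_starts(text, subpatterns, gaps):
    """Return start positions of subpatterns[0] that begin a full VLG match.

    gaps[i] = (lo, hi) bounds the number of characters between the end of
    subpatterns[i] and the start of subpatterns[i + 1] (an assumed convention).
    """
    if not subpatterns or len(gaps) != len(subpatterns) - 1:
        raise ValueError("need k >= 1 subpatterns and exactly k - 1 gap constraints")

    # Work right to left: 'viable' holds the starts of the current subpattern
    # from which the remainder of the pattern can still be matched.
    viable = find_all(text, subpatterns[-1])
    for j in range(len(subpatterns) - 2, -1, -1):
        lo, hi = gaps[j]
        length = len(subpatterns[j])
        kept = []
        for start in find_all(text, subpatterns[j]):
            # The next subpattern must start inside [end + lo, end + hi].
            end = start + length
            k = bisect_left(viable, end + lo)
            if k < len(viable) and viable[k] <= end + hi:
                kept.append(start)
        viable = kept
    return viable


text = "abcxxdefxabcdef"
print(vlg_match_starts(text, ["abc", "def"], [(0, 1)]))
```

This baseline rescans the whole text for every subpattern on every query, which is the cost an index is meant to avoid.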
Original language: English
Title of host publication: SOFSEM 2020: Theory and Practice of Computer Science : 46th International Conference on Current Trends in Theory and Practice of Informatics, SOFSEM 2020, Limassol, Cyprus, January 20–24, 2020, Proceedings
Number of pages: 12
Publisher: Springer
Publication date: 17 Jan 2020
Pages: 493-504
ISBN (print): 978-3-030-38918-5
ISBN (electronic): 978-3-030-38919-2
DOI
Status: Published - 17 Jan 2020
MoE publication type: A4 Article in conference proceedings
Event: SOFSEM: 46th International Conference on Current Trends in Theory and Practice of Computer Science - Limassol, Cyprus
Duration: 21 Jan 2020 to 24 Jan 2020
Conference number: 46
http://cyprusconferences.org/sofsem2020/

Fields of Science

  • 113 Computer and information sciences
