Fast Indexes for Gapped Pattern Matching

Manuel Cáceres, Simon Puglisi, Bella Zhukova

Research output: Conference contribution, Conference paper, Peer-reviewed

Abstract

We describe indexes for searching large data sets for variable-length-gapped (VLG) patterns. A VLG pattern consists of two or more subpatterns, with a gap constraint between each adjacent pair that specifies lower and upper bounds on the distance allowed between the two subpatterns. VLG patterns have numerous applications in computational biology (motif search) and information retrieval (e.g., for language models, snippet generation, machine translation), and they capture a useful subclass of the regular expressions commonly used in practice for searching source code. Our best approach provides search speeds several times faster than prior art across a broad range of patterns and texts.
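To make the definition of a VLG pattern concrete, the following is a minimal sketch that translates a VLG pattern into an equivalent regular expression and scans the text naively. It is not the indexed approach the paper describes; it only illustrates what a match is. The function names `vlg_to_regex` and `vlg_match_starts` are hypothetical, and the sketch assumes a gap is measured as the number of characters between the end of one subpattern and the start of the next.

```python
import re


def vlg_to_regex(subpatterns, gaps):
    """Build a compiled regex for a VLG pattern (illustrative helper).

    subpatterns: list of k >= 2 literal strings.
    gaps: list of k-1 (lower, upper) pairs; assumed here to bound the number
          of characters between consecutive subpatterns.
    """
    if len(subpatterns) < 2 or len(gaps) != len(subpatterns) - 1:
        raise ValueError("need k >= 2 subpatterns and k-1 gap constraints")
    parts = [re.escape(subpatterns[0])]
    for (lower, upper), sub in zip(gaps, subpatterns[1:]):
        if lower < 0 or upper < lower:
            raise ValueError("gap bounds must satisfy 0 <= lower <= upper")
        # ".{lower,upper}" matches any run of characters within the gap bounds.
        parts.append(".{%d,%d}" % (lower, upper))
        parts.append(re.escape(sub))
    # DOTALL lets a gap span newline characters as well.
    return re.compile("".join(parts), re.DOTALL)


def vlg_match_starts(text, subpatterns, gaps):
    """Return every position in text where some occurrence of the pattern begins.

    This is a naive scan over all start positions, not an index.
    """
    regex = vlg_to_regex(subpatterns, gaps)
    return [i for i in range(len(text)) if regex.match(text, i)]


if __name__ == "__main__":
    # "AC" followed by "GA" with between 1 and 4 characters in the gap.
    print(vlg_match_starts("ACGTTTGACA", ["AC", "GA"], [(1, 4)]))
```

A scan like this inspects every text position on every query, which is the cost that an index over the text is meant to avoid.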
Original language: English
Pages: 493-504
Number of pages: 12
Status: Published - 17 Jan 2020
MoE publication type: Not eligible
Event: SOFSEM: 46th International Conference on Current Trends in Theory and Practice of Computer Science - Limassol, Cyprus
Duration: 21 Jan 2020 to 24 Jan 2020
Conference number: 46
http://cyprusconferences.org/sofsem2020/

Conference

Conference: SOFSEM
Abbreviated title: SOFSEM
Country: Cyprus
City: Limassol
Period: 21/01/2020 to 24/01/2020

Fields of science

  • 113 Computer and information sciences
