Abstract

We describe indexes for searching large data sets for variable-length-gapped (VLG) patterns. VLG patterns are composed of two or more subpatterns, between each adjacent pair of which is a gap-constraint specifying upper and lower bounds on the distance allowed between sub-patterns. VLG patterns have numerous applications in computational biology (motif search), information retrieval (e.g., for language models, snippet generation, machine translation) and capture a useful subclass of the regular expressions commonly used in practice for searching source code. Our best approach provides search speeds several times faster than prior art across a broad range of patterns and texts.
Original languageEnglish
Pages493-504
Number of pages12
DOIs
Publication statusPublished - 17 Jan 2020
MoE publication typeNot Eligible
EventSOFSEM: 46th International Conference on Current Trends in Theory and Practice of Computer Science - Limassol, Cyprus
Duration: 21 Jan 202024 Jan 2020
Conference number: 46
http://cyprusconferences.org/sofsem2020/

Conference

ConferenceSOFSEM
Abbreviated titleSOFSEM
CountryCyprus
CityLimassol
Period21/01/202024/01/2020
Internet address

Fields of Science

  • 113 Computer and information sciences
  • Pattern matching with gaps

Cite this

Cáceres, M., Puglisi, S., & Zhukova, B. (2020). Fast Indexes for Gapped Pattern Matching. 493-504. Paper presented at SOFSEM, Limassol, Cyprus. https://doi.org/10.1007/978-3-030-38919-2_40