We describe indexes for searching large data sets for variable-length-gapped (VLG) patterns. VLG patterns are composed of two or more subpatterns, between each adjacent pair of which is a gap-constraint specifying upper and lower bounds on the distance allowed between sub-patterns. VLG patterns have numerous applications in computational biology (motif search), information retrieval (e.g., for language models, snippet generation, machine translation) and capture a useful subclass of the regular expressions commonly used in practice for searching source code. Our best approach provides search speeds several times faster than prior art across a broad range of patterns and texts.
Original languageEnglish
Number of pages12
Publication statusPublished - 17 Jan 2020
MoE publication typeNot Eligible
EventSOFSEM: 46th International Conference on Current Trends in Theory and Practice of Computer Science - Limassol, Cyprus
Duration: 21 Jan 202024 Jan 2020
Conference number: 46


Abbreviated titleSOFSEM
Internet address

Fields of Science

  • 113 Computer and information sciences
  • Pattern matching with gaps

Cite this

Cáceres, M., Puglisi, S., & Zhukova, B. (2020). Fast Indexes for Gapped Pattern Matching. 493-504. Paper presented at SOFSEM, Limassol, Cyprus. https://doi.org/10.1007/978-3-030-38919-2_40