Abstract
We describe indexes for searching large data sets for variable-length-gapped (VLG) patterns. A VLG pattern consists of two or more subpatterns, with a gap constraint between each adjacent pair that specifies lower and upper bounds on the distance allowed between them. VLG patterns have numerous applications in computational biology (motif search) and information retrieval (e.g., language models, snippet generation, machine translation), and they capture a useful subclass of the regular expressions commonly used in practice for searching source code. Our best approach provides search speeds several times faster than prior art across a broad range of patterns and texts.
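To make the definition concrete, the sketch below translates a VLG pattern into an ordinary regular expression and scans a text for it. It is only a naive, index-free baseline and not the indexes described in the paper. The function names `vlg_to_regex` and `vlg_find` are hypothetical, and the sketch assumes that a gap bound counts the characters lying strictly between two adjacent subpatterns.

```python
import re


def vlg_to_regex(subpatterns, gaps):
    """Build a regex for subpatterns separated by bounded gaps.

    gaps[i] = (lo, hi) bounds the number of characters between
    subpatterns[i] and subpatterns[i + 1] (an assumed convention).
    """
    if len(gaps) != len(subpatterns) - 1:
        raise ValueError("need exactly one gap per adjacent pair")
    parts = [re.escape(subpatterns[0])]
    for (lo, hi), sub in zip(gaps, subpatterns[1:]):
        parts.append(f".{{{lo},{hi}}}")  # e.g. ".{1,3}"
        parts.append(re.escape(sub))
    return "".join(parts)


def vlg_find(text, subpatterns, gaps):
    """Return the start positions of all matches, including overlapping ones."""
    # Wrapping the pattern in a lookahead makes each match zero-width,
    # so occurrences that overlap are still reported.
    pattern = re.compile(f"(?=({vlg_to_regex(subpatterns, gaps)}))", re.DOTALL)
    return [m.start() for m in pattern.finditer(text)]


if __name__ == "__main__":
    # "ab", then 1 to 3 arbitrary characters, then "cd"
    print(vlg_to_regex(["ab", "cd"], [(1, 3)]))
    print(vlg_find("abXcd abcd abXXXXcd", ["ab", "cd"], [(1, 3)]))
```

A scan like this inspects the whole text for every query, which is the cost an index over the text is meant to avoid.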
| Original language | English |
|---|---|
| Title of host publication | SOFSEM 2020: Theory and Practice of Computer Science : 46th International Conference on Current Trends in Theory and Practice of Informatics, SOFSEM 2020, Limassol, Cyprus, January 20–24, 2020, Proceedings |
| Number of pages | 12 |
| Publisher | Springer |
| Publication date | 17 Jan 2020 |
| Pages | 493-504 |
| ISBN (print) | 978-3-030-38918-5 |
| ISBN (electronic) | 978-3-030-38919-2 |
| DOI | |
| Status | Published - 17 Jan 2020 |
| MoE publication type | A4 Article in conference proceedings |
| Event | SOFSEM: 46th International Conference on Current Trends in Theory and Practice of Computer Science - Limassol, Cyprus Duration: 21 Jan 2020 → 24 Jan 2020 Conference number: 46 http://cyprusconferences.org/sofsem2020/ |
Fields of Science
- 113 Computer and information sciences