Building and Using Existing Hunspell Dictionaries and TEX Hyphenators as Finite-State Automata

Tommi Pirinen, Krister Linden

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

There are numerous formats for writing spellcheckers for open-source systems, and many languages have descriptions written in these formats. Similarly, for word hyphenation by computer there are TeX rules for many languages. In this paper we demonstrate a method for converting these spell-checking lexicons and hyphenation rule sets into finite-state automata, and present a new finite-state-based system for writer’s tools used in current open-source software such as Firefox, OpenOffice.org and enchant via the spell-checking library voikko.
Original language: English
Title of host publication: Proceedings of International Multiconference on Computer Science and Information Technology : Computational Linguistics – Applications (CLA'10)
Editors: Maria Ganzha, Marcin Paprzycki
Number of pages: 8
Volume: 5
Place of Publication: Wisła, Poland
Publication date: Oct 2010
Pages: 477–484
ISBN (Electronic): 978-83-60810-27-9
Publication status: Published - Oct 2010
MoE publication type: A4 Article in conference proceedings
Event: International Multiconference on Computer Science and Information Technology - Wisła, Poland
Duration: 18 Oct 2010 to 20 Oct 2010
Conference number: 5

Publication series

Name: Proceedings of the International Multiconference on Computer Science and Information Technology
Publisher: Polskie Towarzystwo Informatyczne Oddział Górnośląski
ISSN (Electronic): 1896-7094

Fields of Science

  • 113 Computer and information sciences
  • 612 Languages and Literature
