Building and Using Existing Hunspell Dictionaries and TeX Hyphenators as Finite-State Automata

Tommi Pirinen, Krister Linden

Research output: Chapter in book/report/conference proceeding › Conference contribution › Scientific › Peer-reviewed

Abstract

There are numerous formats for writing spell-checkers for open-source systems, and many languages have been described in these formats. Similarly, for word hyphenation by computer there are TeX rules for many languages. In this paper we demonstrate a method for converting these spell-checking lexicons and hyphenation rule sets into finite-state automata, and present a new finite-state-based system for writer’s tools used in current open-source software such as Firefox, OpenOffice.org and enchant via the spell-checking library voikko.
Original language: English
Title of host publication: Proceedings of International Multiconference on Computer Science and Information Technology: Computational Linguistics – Applications (CLA'10)
Editors: Maria Ganzha, Marcin Paprzycki
Number of pages: 8
Volume: 5
Place of publication: Wisła, Poland
Publication date: Oct 2010
Pages: 477–484
ISBN (electronic): 978-83-60810-27-9
Status: Published - Oct 2010
MoE publication type: A4 Article in conference proceedings
Event: International Multiconference on Computer Science and Information Technology - Wisła, Poland
Duration: 18 Oct 2010 to 20 Oct 2010
Conference number: 5

Publication series

Name: Proceedings of the International Multiconference on Computer Science and Information Technology
Publisher: Polskie Towarzystwo Informatyczne Oddział Górnoslaski
ISSN (electronic): 1896-7094

Fields of science

  • 113 Computer and information sciences
  • 612 Languages and literature
