Analysing Finnish with word lists: The DDI approach to morphology revisited

Atro Tapio Voutilainen, Maria Johanna Palolahti

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Professional

Abstract

Morphological lexicons for morphologically complex languages provide good text coverage at the cost of overgeneration, difficulty of modification, and sometimes performance issues. Use of simple, manageable lexicon forms – especially lists – for morphologically complex languages may appear unviable because the number of possible word-forms in a morphologically complex language can be prohibitively high. We created and experimented with a list-based lexicon for a morphologically complex language (Finnish), and compared its coverage with that of a mature morphological analyser on new text in two experimental settings. The observed smallish difference in coverage suggests the viability of using simple and easy-to-modify list-based lexicons as an initial part of morphological analysis, to increase developer control on the vast majority of input tokens.
Original language: English
Title of host publication: Proceedings of the 4th International Workshop for Computational Linguistics for Uralic Languages
Number of pages: 10
Place of Publication: Stroudsburg
Publisher: The Association for Computational Linguistics
Publication date: 2018
Pages: 171-180
Publication status: Published - 2018
MoE publication type: D3 Professional conference proceedings
Event: International Workshop on Computational Linguistics for Uralic Languages - Helsinki, Finland
Duration: 8 Jan 2018 → 9 Jan 2018
Conference number: 4

Fields of Science

  • 6121 Languages
