Abstract
Current methods for word alignment require considerable amounts of
parallel text to deliver accurate results, a requirement which is met only for
a small minority of the world's approximately 7,000 languages.
We show that by jointly performing word alignment and annotation transfer in
a novel Bayesian model, alignment accuracy can be
improved for language pairs where annotations are available for only
one of the languages, a finding which could facilitate the study and
processing of a vast number of low-resource languages.
We also present an evaluation where our method is used to perform
single-source and multi-source part-of-speech transfer with 22 translations
of the same text in four different languages. This allows us to quantify the
considerable variation in accuracy depending on the specific source text(s)
used, even with different translations into the same language.
Original language | English |
---|---|
Title of host publication | Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers |
Number of pages | 10 |
Place of publication | Osaka, Japan |
Publisher | The Association for Computational Linguistics |
Publication date | 1 Dec 2016 |
Pages | 620-629 |
ISBN (electronic) | 978-4-87974-702-0 |
Publication status | Published - 1 Dec 2016 |
MoE publication type | B2 Part of a book or other research book |
Publication series
Name | Proceedings of COLING |
---|---|
ISSN (print) | 1525-2477 |
Fields of science
- 6121 Languages
Projects
- 1 Finished
- CrossNLP: Cross-Linguistic and Multilingual Natural Language Processing with the Focus on Low-Resource Languages and Language Variants
  Tiedemann, J., Östling, R. & Scherrer, Y.
  01/01/2016 → 31/12/2018
  Project: Research project