HELFI: a Hebrew-Greek-Finnish Parallel Bible Corpus with Cross-Lingual Morpheme Alignment

Anssi Yli-Jyrä, Josi Purhonen, Matti Liljeqvist, Arto Antturi, Pekka Nieminen, Kari M. Räntilä, Valtteri Luoto

Research output: Chapter in book/report/conference proceeding; Conference contribution; Scientific; Peer-reviewed

Abstract

Twenty-five years ago, morphologically aligned Hebrew-Finnish and Greek-Finnish bitexts (texts accompanied by a translation) were constructed manually in order to create an analytical concordance (Luoto et al., 1997) for a Finnish Bible translation. The creators of the bitexts recently secured the publisher's permission to release their fine-grained alignment, but the alignment still depended on proprietary, third-party resources such as a copyrighted text edition and proprietary morphological analyses of the source texts. In this paper, we describe a nontrivial editorial process starting from the creation of the original one-purpose database and ending with its reconstruction using only freely available text editions and annotations. This process produced an openly available dataset that contains (i) the source texts and their translations, (ii) the morphological analyses, and (iii) the cross-lingual morpheme alignments.
Original language: English
Title of host publication: LREC 2020, Eleventh International Conference on Language Resources and Evaluation
Number of pages: 8
Publisher: European Language Resources Association (ELRA)
Status: E-pub ahead of print, 16 Mar 2020
MoE publication type: A4 Article in conference proceedings

Fields of Science

  • 6121 Languages
  • 614 Theology
  • 6122 Literature studies

Cite this

Yli-Jyrä, A., Purhonen, J., Liljeqvist, M., Antturi, A., Nieminen, P., Räntilä, K. M., & Luoto, V. (2020). HELFI: a Hebrew-Greek-Finnish Parallel Bible Corpus with Cross-Lingual Morpheme Alignment. In LREC 2020, Eleventh International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA).