HELFI: a Hebrew-Greek-Finnish Parallel Bible Corpus with Cross-Lingual Morpheme Alignment

Anssi Yli-Jyrä, Josi Purhonen, Matti Liljeqvist, Arto Antturi, Pekka Nieminen, Kari M. Räntilä, Valtter Luoto

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Scientific > peer-review

Abstract

Twenty-five years ago, morphologically aligned Hebrew-Finnish and Greek-Finnish bitexts (texts accompanied by a translation) were constructed manually in order to create an analytical concordance (Luoto et al., 1997) for a Finnish Bible translation. The creators of the bitexts recently secured the publisher's permission to release the fine-grained alignment, but the alignment still depended on proprietary, third-party resources such as a copyrighted text edition and proprietary morphological analyses of the source texts. In this paper, we describe a nontrivial editorial process that starts from the creation of the original single-purpose database and ends with its reconstruction using only freely available text editions and annotations. This process produced an openly available dataset that contains (i) the source texts and their translations, (ii) the morphological analyses, and (iii) the cross-lingual morpheme alignments.
Original language: English
Title of host publication: Proceedings of the 12th Language Resources and Evaluation Conference
Editors: Nicoletta Calzolari [et al.]
Number of pages: 8
Place of Publication: Paris
Publisher: European Language Resources Association (ELRA)
Publication date: May 2020
Pages: 4229–4236
ISBN (Electronic): 979-10-95546-34-4
Publication status: Published - May 2020
MoE publication type: A4 Article in conference proceedings
Event: Language Resources and Evaluation Conference - [LREC 2020 was cancelled]
Duration: 11 May 2020 – 16 May 2020
Conference number: 12
https://lrec2020.lrec-conf.org/

Fields of Science

  • 6121 Languages
      • parallel texts
      • morphemes
      • word alignment
  • 614 Theology
      • bible translation
      • exegetics
      • text criticism
      • text editions
  • 6122 Literature studies
      • most translated books
      • world literature
      • text editions
