Abstract
Suomen Kansan Vanhat Runot (SKVR, Old Poems of the Finnish People) is a collection of nearly 90,000 oral folk poems written down between 1564 and the early 20th century. It is characterized by frequent recurrence of similar pieces of text on various levels (from entire poems, through passages, to single verses and collocations). However, finding these similarities is challenging due to a high degree of orthographical, morphological, and compositional variation. In this article, we propose a method for automatically identifying equivalent verses, i.e. verses conveying the same meaning with the same words, using clustering based on the cosine similarity of character bigram vectors. The method achieves an F-score of around 81% and has been successfully used for identifying similarities across the entire SKVR corpus on the level of verse, passage, and poem. The results can be browsed through a Web interface.
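The core idea named in the abstract, comparing verses by the cosine similarity of their character bigram vectors, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm, so the sketch substitutes simple single-link grouping above a fixed threshold. The function names, the threshold value of 0.7, and the example verses (illustrative spellings, not quoted from the corpus) are all assumptions.

```python
from collections import Counter
from math import sqrt


def bigram_vector(verse):
    """Count overlapping character bigrams in a lowercased verse."""
    text = verse.lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(u, v):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(count * v[gram] for gram, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_verses(verses, threshold=0.7):
    """Group verses whose pairwise similarity reaches the threshold.

    Assumption: single-link grouping via union-find stands in for the
    clustering step; the threshold is a hypothetical value.
    """
    vectors = [bigram_vector(v) for v in verses]
    parent = list(range(len(verses)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(verses)):
        for j in range(i + 1, len(verses)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i, verse in enumerate(verses):
        clusters.setdefault(find(i), []).append(verse)
    return list(clusters.values())


# Illustrative verses: two spelling variants of one line and one unrelated line.
verses = [
    "vaka vanha väinämöinen",
    "vaka vanha wäinämöinen",
    "tietäjä iän ikuinen",
]
clusters = cluster_verses(verses)
```

Here the two spelling variants differ in a single character, so most of their bigrams coincide and they fall into the same group, while the third verse shares only a few bigrams with them and stays separate. The all-pairs comparison is quadratic in the number of verses, so a corpus of SKVR's size would need a more scalable similarity search than this sketch provides.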
Original language | English |
---|---|
Journal | Digital Scholarship in the Humanities |
Volume | 38 |
Issue number | 1 |
Pages (from-to) | 180–194 |
Number of pages | 15 |
ISSN | 2055-7671 |
DOIs | |
Publication status | Published - Apr 2023 |
MoE publication type | A1 Journal article-refereed |
Fields of Science
- 6122 Literature studies
- 113 Computer and information sciences
- 6160 Other humanities
- oral poetry
- folklore
- intertextuality
- variation
Projects
- 1 Finished
- FILTER: Formulaic intertextuality, thematic networks and poetic variation across regional cultures of Finnic oral poetry (Academy of Finland research project no. 333138)
  Kallio, K. (Project manager), Mäkelä, E. (Project manager), Janicki, M. M. (Participant), Saarinen, J. (Participant), Sarv, M. (Participant) & Kanner, A. (Participant)
  01/09/2020 → 31/08/2024
  Project: Research project