TY - ADVS
T1 - SUS Fieldwork: SUST 77 v0.1, Korp
AU - Rueter, Jack
AU - Erina, Olga
AU - Axelson, Erik
N1 - Mordwinische Volksdichtung. Gesamm. von H. Paasonen. Hrsg. und übers. von Paavo Ravila. I. Band. SUST LXXVII. 1938. XXVI + 509 p. (partial)
PY - 2022/8
Y1 - 2022/8
N2 - This Korp publication contains the first 25 texts of the Finno-Ugrian Society's SUST 77 Erzya-language folklore texts. The texts included consist of (1) original fieldwork transliterations, (2) automatically normalized Cyrillic text, and (3) original German-language translations. The texts have been aligned so that the normalized versions of the original transliterations are used in the search engine. The word forms have been analyzed using a finite-state analyzer developed by Jack Rueter in collaboration with universities in Tromsø, Norway; Saransk, Mordovia; and Turku, Finland. Subsequently, the words have been disambiguated and the sentences analyzed using a constraint-grammar description developed with the same consultancies and collaborations. In the search engine, one can locate words according to string and feature values of the morphological analyses. Once a word form is located and selected, important bibliographical information can be found in the right margin along with a word analysis and a dependency tree representation. The entire sentence is also shown in the original transliteration along with a German-language translation. The bibliographical information includes the identifying text number, the page range of the text, and the page where the sentence is found, on the one hand, and the speaker name as well as locale and date, on the other. This version demonstrates the present state of normalization, finite-state disambiguated morphological analyses, and constraint-grammar syntactic analyses, with a conversion to the feature and dependency marking used in the Universal Dependencies project. The analyzer has been designed to recognize dialect variation and may therefore recognize forms beyond those covered by the normalization.
AB - This Korp publication contains the first 25 texts of the Finno-Ugrian Society's SUST 77 Erzya-language folklore texts. The texts included consist of (1) original fieldwork transliterations, (2) automatically normalized Cyrillic text, and (3) original German-language translations. The texts have been aligned so that the normalized versions of the original transliterations are used in the search engine. The word forms have been analyzed using a finite-state analyzer developed by Jack Rueter in collaboration with universities in Tromsø, Norway; Saransk, Mordovia; and Turku, Finland. Subsequently, the words have been disambiguated and the sentences analyzed using a constraint-grammar description developed with the same consultancies and collaborations. In the search engine, one can locate words according to string and feature values of the morphological analyses. Once a word form is located and selected, important bibliographical information can be found in the right margin along with a word analysis and a dependency tree representation. The entire sentence is also shown in the original transliteration along with a German-language translation. The bibliographical information includes the identifying text number, the page range of the text, and the page where the sentence is found, on the one hand, and the speaker name as well as locale and date, on the other. This version demonstrates the present state of normalization, finite-state disambiguated morphological analyses, and constraint-grammar syntactic analyses, with a conversion to the feature and dependency marking used in the Universal Dependencies project. The analyzer has been designed to recognize dialect variation and may therefore recognize forms beyond those covered by the normalization.
KW - 6121 Languages
KW - Erzya language
KW - Morphological annotation
KW - Finno-Ugrian Society
KW - Paasonen Mordwinische Volksdichtung
KW - HFST
KW - GiellaLT
KW - Finite-state morphology
KW - Constraint grammar
KW - Uralic languages
UR - https://korp.csc.fi/korp-test/sus-fieldwork/?mode=other_languages#?corpus=sus_fieldwork_myv&cqp=%5B%5D&prefix
M3 - Software
PB - Kielipankki
CY - Helsinki
ER -