Description
Occitan is a regional language spoken in southern France and in parts of Italy and Spain. Like many such languages, it has only recently started to enter the digital era. Basic digital tools and resources (text databases, electronic dictionaries, text-to-speech tools) have been created, and Occitan Wikipedia is also being developed. We present OcWikiDisc, a 500,000-word corpus extracted from Occitan Wikipedia’s discussion pages. It contains direct user-to-user interactions on various topics. We analyze Occitan dialects and spelling norms in a sample of the corpus, as a first attempt to model how Occitan is used on this medium.
Period | 6 Oct 2022 |
---|---|
Event title | 8th Estonian Digital Humanities Conference: Shifts in language and culture: computational approaches to variation and change |
Event type | Conference |
Location | Tallinn, Estonia |
Degree of recognition | International |