Abstract
Open-source analyzer dictionary development is being implemented for Skolt
Sami, Ingrian, Moksha-Mordvin, etc. in the Helsinki CSC infrastructure, home
of the Finnish Kielipankki 'Language Bank' and Termipankki 'Term Bank'. The
proximity of minority-language corpora in need of annotation and the multiple
uses of controlled Wikimedia-type dictionaries make CSC an attractive site for
synchronized transducer dictionary development. The open-source FST development
of Uralic and other minority languages at Giellatekno-Divvun in Tromsø
demonstrates a vast potential for reuse of FSTs, further augmented by
open-source work in OmorFi, Apertium and Universal Dependencies
<http://universaldependencies.org/#language-urj>. The initial idea is to allow
synchronized editing of Giellatekno XML and CSC wiki structures via GitHub. In
addition to allowing for simple lexc line exports
(LEMMA:STEM CONTINUATION_LEXICON "TRANSLATION" ;), the parallel dictionaries
will provide for documentation of derivation, morpho-syntactic information on
valency and government, semantics and etymology.
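The lexc line export mentioned in the abstract can be illustrated with a minimal sketch. It is not the project's actual exporter: the function names, the set of escaped characters, and the sample entry are assumptions made for illustration only. It assumes a dictionary entry has already been reduced to a lemma, a stem, a continuation lexicon name and a translation.

```python
# Hypothetical sketch of a lexc line export; not the project's real tooling.

# Characters that lexc treats specially and that are escaped with '%'.
# This set is an illustrative assumption, not an exhaustive list.
LEXC_SPECIAL = set(' :;!<>"%#0')


def escape_lexc(text: str) -> str:
    """Prefix every lexc-special character in text with '%'."""
    return "".join("%" + ch if ch in LEXC_SPECIAL else ch for ch in text)


def lexc_line(lemma: str, stem: str, continuation: str, translation: str) -> str:
    """Build one line: LEMMA:STEM CONTINUATION_LEXICON "TRANSLATION" ;"""
    # Double quotes delimit the translation string, so swap any inner ones.
    gloss = translation.replace('"', "'")
    return f'{escape_lexc(lemma)}:{escape_lexc(stem)} {continuation} "{gloss}" ;'


if __name__ == "__main__":
    # Hypothetical entry; the continuation lexicon name N_KALA is invented.
    print(lexc_line("kala", "kal", "N_KALA", "fish"))
```

Keeping the escaping in a separate helper means the same lemma and stem strings could be reused unescaped on the wiki or XML side, with the escaping applied only at export time.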
Original language | English |
---|---|
Title of host publication | 3rd International Workshop for Computational Linguistics of Uralic Languages (IWCLUL 2017) : St. Petersburg, Russia 23 – 24 January 2017 |
Editors | Francis M. Tyers, Michael Rießler, Tommi A. Pirinen, Trond Trosterud |
Number of pages | 7 |
Place of Publication | Stroudsburg |
Publisher | The Association for Computational Linguistics |
Publication date | 2017 |
Pages | 1-7 |
Article number | 2 |
ISBN (Print) | 978-1-5108-3665-5 |
DOIs | |
Publication status | Published - 2017 |
MoE publication type | A4 Article in conference proceedings |
Event | International Workshop for Computational Linguistics of Uralic Languages - St. Petersburg, Russian Federation; Duration: 23 Jan 2017 → 24 Jan 2017; Conference number: 3 |
Fields of Science
- 6121 Languages
- Open-source
- Analyzer dictionary development
- Wiki-based dictionary
- Synchronized dictionary editing
- Uralic Languages
- Semantics
- Morphology
- Morpho-syntactic data
- Etymology
Projects
- 1 Finished
-
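Koltansaamen elvytys kieliteknologia-avusteisen kielenoppimisohjelmien avulla sekä mallin ja ohjeiden laatiminen menetelmän siirtämiseksi toisiin uhanalaisiin kieliin (Revitalization of Skolt Sami through language-technology-assisted language-learning programs, and drafting of a model and guidelines for transferring the method to other endangered languages)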
Rueter, J. (Project manager), Hämäläinen, M. (Participant), Koponen, E. (Participant) & Lehtinen, M. (Participant)
01/01/2015 → 31/12/2017
Project: Research project
Activities
-
Mari FST and Corpus work
Rueter, J. (Consultant), Trosterud, T. (Consultant) & Bradley, J. (Consultant)
6 Jan 2019 → 8 Jan 2019
Activity: Consultancy types › Consultancy
-
Research Data and Humanities 2019
Rueter, J. (Speaker: Presenter), Hämäläinen, M. (Speaker: Presenter) & Alnajjar, K. (Speaker: Presenter)
14 Aug 2019 → 16 Aug 2019
Activity: Participating in or organising an event types › Organisation and participation in conferences, workshops, courses, seminars
-
Kindred People's Day Conference 2019
Rueter, J. (Speaker)
11 Oct 2019
Activity: Talk or presentation types › Oral presentation