Abstract
The Universal Dependencies (UD) project grew out of substantial recent interest in unifying annotation schemes across languages. With its own annotation principles and an abstract inventory of parts of speech, morphosyntactic features and dependency relations, UD aims to facilitate multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. This paper presents the Turkish IMST-UD Treebank, the first Turkish treebank to be included in a UD release. The IMST-UD Treebank was automatically converted from the IMST Treebank, which was itself released recently. We describe the conversion procedure in detail, complete with mapping tables. We also evaluate parsing performance on both versions of the IMST Treebank. Our findings suggest that the UD framework is at least as viable for Turkish as the original annotation framework of the IMST Treebank.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers |
| Editors | Yuji Matsumoto, Rashmi Prasad |
| Number of pages | 11 |
| Place of Publication | Osaka, Japan |
| Publisher | The Association for Computational Linguistics |
| Publication date | Dec 2016 |
| Pages | 3444-3454 |
| ISBN (Electronic) | 978-4-87974-702-0 |
| Publication status | Published - Dec 2016 |
| MoE publication type | A4 Article in conference proceedings |
| Event | International Conference on Computational Linguistics, Osaka, Japan. Duration: 11 Dec 2016 → 16 Dec 2016. Conference number: 26 |
Fields of Science
- 6121 Languages
- 113 Computer and information sciences
Datasets
- IMST-UD Treebank
Sulubacak, U. (Creator), Gökırmak, M. (Contributor), Tyers, F. (Supervisor) & Eryiğit, G. (Data Manager), LINDAT/CLARIN, 15 May 2016
https://github.com/UniversalDependencies/UD_Turkish-IMST/tree/master, https://lindat.mff.cuni.cz/repository/xmlui/handle/11234/1-2515
Dataset