Numerals and what counts

Jack Rueter, Niko Partanen, Tommi A Pirinen

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Scientific; peer-review


This study discusses how numerals and related expressions are currently annotated in the Universal Dependencies project, focusing on the Uralic language family with only occasional reference to other language groups. We analyse how annotation conventions differ between individual treebanks and highlight areas where further development work and systematization could prove beneficial. At the same time, the Universal Dependencies project already offers a wide range of conventions for marking nuanced variation in numerals and counting expressions, and harmonizing these conventions across languages could be the next step. The discussion refers specifically to Universal Dependencies version 2.8, and some of the differences found may already have been harmonized in version 2.9. Whether or not that has happened, we believe the study remains an important record of this period in the project.
Original language: English
Title of host publication: Fifth Workshop on Universal Dependencies : Proceedings
Editors: Miryam de Lhoneux, Reut Tsarfaty
Number of pages: 9
Place of Publication: Stroudsburg
Publisher: The Association for Computational Linguistics
Publication date: Dec 2021
ISBN (Electronic): 978-1-955917-17-9
Publication status: Published - Dec 2021
MoE publication type: A4 Article in conference proceedings
Event: Workshop on Universal Dependencies: UDW, SyntaxFest 2021 - [Online event], Sofia
Duration: 21 Mar 2022 to 25 Mar 2022
Conference number: 6

Fields of Science

  • 6121 Languages
  • universal dependencies
  • numerals
  • treebanks
  • Morphological annotation
  • Uralic languages
  • Erzya language
  • Moksha language
  • Olonets-Karelian
  • Karelian language
  • Komi-Zyrian
  • Komi-Permyak language
  • Finnish language
  • Estonian language
  • Skolt Sami language
  • Northern Sami language
  • Hungarian language
  • syntax
