Description

This dataset contains a Late Babylonian model for Aleksi Sahala's BabyLemmatizer 2.1. The model is used to lemmatize and POS-tag the texts published as Linguistically Annotated Achemenet Babylonian Texts and BALT: Babylonian Administrative and Legal Texts. Its training data consists of first-millennium BCE Babylonian texts from Oracc.
Date made available: 6 Mar 2025
Publisher: Zenodo
