Model Interpretability as Error Analysis for Variation Research

Activity: Talk or presentation types: Oral presentation

Description

Language variety classification has advanced considerably, with reported accuracies of up to 100%. However, results depend heavily on how similar the varieties are and on the scope of the classification task (see, for example, Aepli et al., 2022, 2023). Moreover, because classification relies on black-box approaches, we do not understand how models reach their decisions. As a result, we do not know whether the classifications are actually based on dialect features, and we cannot troubleshoot or influence them linguistically.

To address this, Xie et al. (2024) propose extracting the dialect features used for classification by training BERT-based dialect classifiers and applying a post-hoc leave-one-word-out approach to detect the lexical items that contribute most to the prediction probability of a sentence. These words can be assumed to be the most characteristic of a particular dialect, which in turn lets us check whether the relevant features are indeed dialectal in nature and improve models through proper error analysis. We first test this approach on a corpus collected from the social media platform Jodel, with data from Austria, Germany and Switzerland (Hovy & Purschke, 2018; Purschke & Hovy, 2019). We extend the original method, which is based on binary classification, to multiclass classification and extract dialect-specific features for various numbers of classes. This allows us to evaluate the features most relevant for dialect classification. In addition to Xie et al.’s original analysis, we also examine incorrect classifications and the features relevant to these decisions in order to understand the model's behaviour. For this type of social media data in particular, the gold standard inferred from a post’s geolocation may be somewhat distorted, since people do not always write in the dialect of the region they are posting from.
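The leave-one-word-out idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the three labels (`AT`, `CH`, `DE`) and the toy cue-word classifier are hypothetical stand-ins for a trained BERT-based classifier, and any function that maps a token list to class probabilities could be substituted. The sketch scores each word by how much the probability of a target class drops when that word is removed, and it works for any number of classes.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Assumption: a classifier is any callable mapping tokens to class probabilities.
ProbaFn = Callable[[List[str]], Dict[str, float]]


def toy_predict_proba(tokens: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a trained dialect classifier (toy cue words only)."""
    cues = {"grüezi": "CH", "servus": "AT", "moin": "DE"}
    scores = {"AT": 1.0, "CH": 1.0, "DE": 1.0}
    for token in tokens:
        label = cues.get(token.lower())
        if label is not None:
            scores[label] += 2.0
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


def leave_one_word_out(
    tokens: List[str],
    predict_proba: ProbaFn,
    target: Optional[str] = None,
) -> Tuple[str, List[Tuple[str, float]]]:
    """Rank words by the drop in the target-class probability when each is removed.

    If no target is given, the class predicted for the full sentence is used.
    For error analysis, the target can be set to a wrongly predicted class.
    """
    full = predict_proba(tokens)
    if target is None:
        target = max(full, key=full.get)
    contributions = []
    for i, token in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]  # sentence without the i-th word
        drop = full[target] - predict_proba(reduced)[target]
        contributions.append((token, drop))
    # Largest drop first: these words contribute most to the prediction.
    contributions.sort(key=lambda pair: pair[1], reverse=True)
    return target, contributions


sentence = "grüezi mitenand wie gaht s".split()
predicted, ranking = leave_one_word_out(sentence, toy_predict_proba)
print(predicted, ranking[:3])
```

Passing an explicit `target` makes the same routine usable for misclassified posts: the ranking then shows which words pushed the model towards the incorrect class.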

Moreover, we work with Bosnian-Croatian-Montenegrin-Serbian social media data (Rupnik et al., 2023, Miletić & Miletić, 2024) as well as with reference corpora of historical German (https://www.deutschdiachrondigital.de/) to further evaluate how model interpretability is beneficial for error analysis. Overall, this advances our understanding of the reasoning behind model classifications and offers a way to analyse the classification errors more effectively.
Period: 11 Sept 2024
Event title: Langues et langage à la croisée des disciplines
Event type: Conference
Conference number: 1
Location: Paris, France
Degree of recognition: International