TY - JOUR
T1 - Let's Play Mono-Poly: BERT Can Reveal Words' Polysemy Level and Partitionability into Senses
T2 - Transactions of the Association for Computational Linguistics (TACL)
AU - Gari Soler, Aina
AU - Apidianaki, Marianna
PY - 2021
Y1 - 2021
N2 - Pre-trained language models (LMs) encode rich information about linguistic structure but their knowledge about lexical polysemy remains unclear. We propose a novel experimental setup for analysing this knowledge in LMs specifically trained for different languages (English, French, Spanish and Greek) and in multilingual BERT. We perform our analysis on datasets carefully designed to reflect different sense distributions, and control for parameters that are highly correlated with polysemy such as frequency and grammatical category. We demonstrate that BERT-derived representations reflect words' polysemy level and their partitionability into senses. Polysemy-related information is more clearly present in English BERT embeddings, but models in other languages also manage to establish relevant distinctions between words at different polysemy levels. Our results contribute to a better understanding of the knowledge encoded in contextualised representations and open up new avenues for multilingual lexical semantics research.
KW - 6121 Languages
KW - 113 Computer and information sciences
DO - 10.1162/tacl_a_00400
M3 - Article
SN - 2307-387X
VL - 9
SP - 825
EP - 844
JO - Transactions of the Association for Computational Linguistics (TACL)
JF - Transactions of the Association for Computational Linguistics (TACL)
ER -