This paper describes ongoing work towards a rich analysis of the social contexts of neologism use in historical corpora, in particular the Corpora of Early English Correspondence. Our research questions concern the innovators, meanings and diffusion of neologisms. To enable this kind of study, we are developing new processes, tools and ways of combining data from different sources, including the Oxford English Dictionary, the Historical Thesaurus, and contemporary published texts. Comparing neologism candidates across these sources is complicated by extensive spelling variation. To make the problem tractable, we start from case studies of individual suffixes (-ity, -er) and people (Thomas Twining). The tools developed for these case studies serve as building blocks for more general analyses. Our aim is to develop an open-source environment in which information on neologism candidates is gathered from a variety of algorithms and sources, pooled, and presented to a human evaluator for verification and exploration.
- 6121 Languages
STRATAS: Interfacing structured and unstructured data in sociolinguistic research on language change
01/01/2016 → 31/12/2019