Abstract
This paper is part of a collaboration between computer scientists and historians aimed at developing novel methods for analysing historical newspapers. We present a case study of ideological terms ending with the suffix -ism in nineteenth-century Finnish newspapers. We propose a two-step procedure to trace differences in word usage over time: training diachronic embeddings on several time slices, and then clustering the embeddings of selected words together with their neighbours to obtain historical context. The obtained clusters turn out to be useful for historical studies. The paper also discusses specific difficulties related to the development of historian-oriented tools.
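The two-step procedure can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes count-based co-occurrence vectors as a stand-in for trained diachronic embeddings (they share one vocabulary, so vectors from different time slices are directly comparable), and a simple greedy similarity-threshold clustering. The function names, the threshold, and the toy English corpus are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt

# Toy corpus, invented for illustration: tokenised sentences per time slice.
SLICES = {
    "1860-1879": [
        ["socialism", "threatens", "religion"],
        ["socialism", "threatens", "order"],
        ["religion", "protects", "order"],
    ],
    "1880-1899": [
        ["socialism", "organises", "workers"],
        ["workers", "demand", "wages"],
        ["socialism", "demands", "wages"],
    ],
}

def cooccurrence_vectors(sentences, vocab, window=2):
    """Step 1 stand-in: one sparse count vector per word (assumption, not word2vec)."""
    vectors = {w: Counter() for w in vocab}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def neighbours(vectors, target, k=3):
    """Up to k words most similar to the target within one time slice."""
    scored = [(cosine(vectors[target], vec), w) for w, vec in vectors.items() if w != target]
    return [w for s, w in sorted(scored, reverse=True)[:k] if s > 0]

def cluster(items, vectors, threshold=0.5):
    """Step 2 stand-in: greedily join an item to the first cluster with a similar member."""
    clusters = []
    for item in items:
        for members in clusters:
            if any(cosine(vectors[item], vectors[m]) >= threshold for m in members):
                members.append(item)
                break
        else:
            clusters.append([item])
    return clusters

def trace_usage(slices, target, k=3, threshold=0.5):
    """Collect the target and its neighbours per slice, then cluster them together."""
    vocab = {w for sents in slices.values() for s in sents for w in s}
    pooled, context = {}, {}
    for label, sentences in slices.items():
        vectors = cooccurrence_vectors(sentences, vocab)
        if not vectors[target]:
            continue  # target does not occur in this slice
        context[label] = neighbours(vectors, target, k)
        for w in [target] + context[label]:
            pooled[(label, w)] = vectors[w]
    return context, cluster(list(pooled), pooled, threshold)

if __name__ == "__main__":
    context, clusters = trace_usage(SLICES, "socialism")
    for label, words in context.items():
        print(label, words)
    for members in clusters:
        print(members)
```

Each clustered item is a `(time slice, word)` pair, so a cluster that mixes slices suggests a context that persists over time, while a cluster confined to one slice suggests a period-specific usage. In practice the count vectors would be replaced by embeddings trained on each slice, and the greedy grouping by a proper clustering algorithm.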
Original language | English |
---|---|
Title of host publication | Workshop on Language Technology for Digital Historical Archives : with a Special Focus on Central-, (South-)Eastern Europe, Middle East and North Africa (LT-DHA2019) |
Number of pages | 8 |
Place of Publication | Red Hook, NY |
Publisher | Curran Associates Inc. |
Publication date | 2019 |
Pages | 3-10 |
ISBN (Print) | 978-1-7138-0298-3 |
Publication status | Published - 2019 |
MoE publication type | A4 Article in conference proceedings |
Event | Workshop on Language Technology for Digital Historical Archives : with a Special Focus on Central-, (South-)Eastern Europe, Middle East and North Africa (LT-DHA2019) - Varna, Bulgaria. Duration: 5 Aug 2019 → 5 Sep 2019. https://www.inf.uni-hamburg.de/inst/dmp/hercore/publications/ltdha.html |
Fields of Science
- 113 Computer and information sciences
Projects
2 Finished
- Embeddia: Cross-Lingual Embeddings for Less-Represented Languages in European News Media
  Toivonen, H. T., Linden, C., Granroth-Wilding, M., Leppänen, L., Alnajjar, K. & Zosa, E.
  01/01/2019 → 31/03/2022
  Project: Research project
- NewsEye: A Digital Investigator for Historical Newspapers
  Toivonen, H. T., Tolonen, M., Kaukonen, M., Marjanen, J., Granroth-Wilding, M., Leppänen, L., Avikainen, J., Alnajjar, K., Zosa, E. & Hengchen, S.
  01/05/2018 → 31/01/2022
  Project: Research project