Word Clustering for Historical Newspapers Analysis

Research output: Article in book/report/conference proceedings, Conference article, Scientific, peer-reviewed

Abstract

This paper is part of a collaboration between computer scientists and historians aimed at developing novel methods for the analysis of historical newspapers. We present a case study of ideological terms ending with the suffix -ism in nineteenth-century Finnish newspapers. We propose a two-step procedure to trace differences in word usage over time: training diachronic embeddings on several time slices, and then clustering the embeddings of selected words together with their neighbours to obtain historical context. The obtained clusters turn out to be useful for historical studies. The paper also discusses specific difficulties related to the development of historian-oriented tools.
Original language: English
Title of host publication: Workshop on Language Technology for Digital Historical Archives : with a Special Focus on Central-, (South-)Eastern Europe, Middle East and North Africa (LT-DHA2019)
Number of pages: 8
Place of publication: Red Hook, NY
Publisher: Curran Associates Inc.
Publication date: 2019
Pages: 3-10
ISBN (print): 978-1-7138-0298-3
Status: Published - 2019
Ministry of Education and Culture (OKM) publication type: A4 Article in conference proceedings
Event: Workshop on Language Technology for Digital Historical Archives : with a Special Focus on Central-, (South-)Eastern Europe, Middle East and North Africa (LT-DHA2019) - Varna, Bulgaria
Duration: 5 August 2019 to 5 September 2019
https://www.inf.uni-hamburg.de/inst/dmp/hercore/publications/ltdha.html

Fields of science

  • 113 Computer and information sciences
