Quantifying the impact of dirty OCR on historical text analysis: Eighteenth Century Collections Online as a case study

Mark John Hill, Simon Hengchen

Research output: Contribution to journal, Article, Scientific, peer-review

Original language: English
Journal: Digital Scholarship in the Humanities : DSH
Volume: 34
Issue number: 4
Pages (from-to): 825–843
Number of pages: 19
ISSN: 2055-7671
Publication status: Published - 2019
MoE publication type: A1 Journal article-refereed

Fields of Science

  • 615 History and Archaeology
  • 6160 Other humanities
  • 113 Computer and information sciences

Projects

COMHIS: Helsinki Computational History Group

Tolonen, M., Hengchen, S., Kanner, A., Säily, T., Vaara, V., Ijaz, A., Roivainen, H., Hill, M. J., Ros, R., Marjanen, J., Mäkelä, E., Tiihonen, I. & Lahti, L.

SUOMEN AKATEMIA

01/01/2018 → …

Project: Research project

Activities

  • 1 Invited talk

Quantifying the impact of messy data on historical text analysis

Mark J. Hill (Speaker) & Simon Hengchen (Speaker)

4 Jul 2018

Activity: Talk or presentation types, Invited talk
