Abstract
There is a critical point to be made about the proper use of metadata in digital history. In any computational analysis of a large historical dataset, there is a strong temptation to treat the dataset as a holistic representation of the language and intellectual landscape of its era. Digital history projects are often rightly criticised for a naive approach to sources that simplifies complex phenomena (Leca-Tsiomis, 2013; Bode, 2017). In this paper we demonstrate a way to avoid this pitfall and show that proper use of metadata is necessary for serious corpus control and digital source criticism.
This work makes two contributions to the history of the book and digital history. First, we present a methodological approach for creating a historical biographical database from a bibliographical catalogue. Second, we demonstrate solutions for forming a uniform dataset from a noisy and heterogeneous starting point. This opens new opportunities that earlier historical research using bibliographical data has missed because of problems of data quality and coverage (Raven, 2007: 193). For example, while publisher networks had a greater impact on the distribution of ideas in the early modern period than has been realised, publisher information has not previously been extracted at scale as a source, despite its potential to change the way we study intellectual history. Additionally, as this work is part of a wider intellectual history research project, and the dataset produced here is combined with other bibliographical research strands, we also make more general claims about the utility of proper metadata in quantitative computational book history.
Original language | English |
---|---|
Status | Published - 11 July 2019 |
MoE publication type | Not Eligible |
Event | DH2019: ADHO annual digital humanities conference - Utrecht, Netherlands Duration: 8 July 2019 → 12 July 2019 https://dev.clariah.nl/files/dh2019/boa/0740.html |
Conference
Conference | DH2019 |
---|---|
Abbreviated title | DH2019 |
Country/Territory | Netherlands |
City | Utrecht |
Period | 08/07/2019 → 12/07/2019 |
Internet address |
Fields of Science
- 615 History and Archaeology