Research output per year
Professor of Digital Humanities
PL 24 (Unioninkatu 40)
00014
Finland
Research activity per year
Mikko Tolonen is professor of Digital Humanities at the Faculty of Arts at the University of Helsinki. He has a PhD in intellectual history (2010) and he is the PI and founder of the Helsinki Computational History Group (COMHIS) at the Department of Digital Humanities.
Tolonen's main research focus is on the Enlightenment era and on integrated interdisciplinary studies of public discourse, knowledge production, and book and intellectual history. These studies combine metadata from library catalogues with full-text and image libraries of books, newspapers and periodicals in early modern Europe. He is one of the editors of Hume's History of England for Oxford University Press. In 2016, he was awarded an Open Science and Research Award by the Finnish Ministry of Education. In 2025 he gave the annual Voltaire Foundation lecture on Digital Enlightenment Studies at the University of Oxford.
In digital humanities education, he believes in project-based teaching, exemplified by the annual, award-winning Helsinki Digital Humanities Hackathon that he founded in 2015. Tolonen is also the local PI at UH within two Marie Curie Training Networks (CASCADE and MECANO).
Tolonen is active in the development of infrastructures and collaborative networks. He has served on the executive board of the European Association for Digital Humanities (EADH) and as the chair of Digital Humanities in the Nordic and Baltic Countries (DHNB). He has also done his share to advance humanities research infrastructure building in Finland (FIN-CLARIAH and DARIAH-FI).
Tolonen supervises work across COMHIS’s wide research spectrum and welcomes contact from talented early-career scholars keen to collaborate with his group. Former international postdocs from the group have since secured lecturer positions at leading European universities.
For interviews, see the CSC interview (in Finnish) and 375 humanists (in English and other languages). He is also on Bluesky (@tolonen.bsky.social) and Twitter (@mikko_tolonen).
Mikko Tolonen’s work spans digital humanities, book history and intellectual history, focusing on early modern print culture and computational analysis. The COMHIS approach towards books as cultural artefacts, data and vehicles of meaning can be described as holistic (in other words, books are not treated merely as texts).
Tolonen's research has been published in central digital humanities forums (Journal of Cultural Analytics, Digital Scholarship in the Humanities, Historical Methods, Digital Enlightenment Studies, Journal of Open Humanities Data) and in main NLP-related venues such as ACL. At the same time, he has also published in well-established traditional journals, such as Historical Journal, Eighteenth-Century Studies, Huntington Library Quarterly and Explorations in Economic History, and in books by Oxford University Press and Cambridge University Press. This versatility is crucial for computational history.
Publishing in the best traditional forums functions as the litmus test for the relevance of computation in historical research. These results reflect COMHIS’s long-term strategy: most publications stem from integrated interdisciplinary work and are co-authored by group members, at times together with external partners.
Below, some of Tolonen's publications are grouped into thematic categories, with each category highlighting the evolution of computational methods (from early text-reuse detection to advanced machine learning) applied to those topics. Each category lists selected publications, giving the title, the year and a link to the full publication details, including access to the article itself.
Tolonen studies early modern book history and print culture using large-scale bibliographic data. His work integrates library catalogues and addresses challenges of data quality and completeness in national bibliographies. Early studies introduced “bibliographic data science” to map publishing trends (e.g. book formats, vernacularization), while later research employs quantitative analysis (statistical and data-driven models) to uncover patterns in publication language, canon formation and economic aspects of the book trade.
This category covers intellectual and publishing networks of the Enlightenment and beyond. Tolonen applies social network analysis and bibliometric methods to understand how ideas and publications spread through networks of authors, publishers and communities. Using large-scale bibliographic datasets, his studies reveal structural patterns—for example, mapping the Scottish Enlightenment print ecosystem uncovered distinct publisher roles in Edinburgh vs. London and emphasized relational careers of key figures. Early work in this area introduced data-driven network analysis of Enlightenment publishing, and recent publications in leading journals contextualize major and marginal players in the knowledge distribution networks over time.
Tolonen’s research also explores newspaper archives and the emergence of a public sphere. He employs computational text analysis to study changing vocabularies, discourse dynamics, and the reach of periodicals across time and geography. Earlier studies analyzed basic features (language, location, publication frequency) of Finnish newspapers to delineate a national public sphere. Subsequently, more sophisticated natural language processing techniques were introduced. For example, one study used dependency parsing and neural word embeddings to trace how the concept of “nation” evolved semantically across four languages’ newspaper corpora. Such work illustrates a shift from manual analysis to data-driven methods that detect long-term conceptual changes and cross-lingual trends in public discourse.
A significant thread in Tolonen’s work is the study of text reuse and intertextuality, shedding light on how ideas and texts were circulated and re-purposed in the early modern period. Recognizing that matching shared passages across documents can reveal the spread and evolution of ideas, he helped develop computational tools for large-scale text reuse detection. The Reception Reader web tool, for instance, enables scholars to visually explore the reuse of texts in Early English Books Online and ECCO, revealing previously hidden patterns of reception across time. Tolonen’s publications trace a methodological progression from targeted case studies (e.g. comparing text similarity in one author’s works) to building optimized, big-data systems. A recent contribution reports on handling “billions of text reuse instances” from nearly all 18th-century printed texts, detailing how a hybrid data pipeline (SQL databases combined with Spark big-data processing) was optimized to support interactive humanities research.
Tolonen’s interdisciplinary work extends to language change and stylistic analysis in historical texts. Using corpora like ECCO and newspapers, these studies quantify how language and genres evolved over the Enlightenment and modern era. Earlier efforts applied statistical techniques to detect shifts within texts (for example, identifying genre changes or register shifts in multi-genre works). More recently, Tolonen’s collaborations have leveraged modern language models. One project trained a BERT-based model on 18th-century text data to predict publication dates. Other works similarly harness transformer models and distributional semantics to classify text registers and track domain-specific vocabulary change (e.g. the rise of economic terminology) over time. This trajectory illustrates the field’s shift from rule-based or manual analysis to explainable AI approaches in studying historical language.
Across all topics, Tolonen contributes to developing computational methods, tools and data workflows that advance digital humanities research. These publications focus on the infrastructure and methodological frameworks enabling large-scale analysis of historical data. In the bibliographic domain, Tolonen has helped formalize best practices for data curation and integration – from automatically determining edition groupings via metadata, to defining open workflows for multilingual library data (through DARIAH’s working groups on bibliographical data). He also explored statistical approaches to uncertainty in historical datasets, using probabilistic programming to model biases and gaps in sources. Collectively, these works show a progression toward more robust, scalable and transparent computational pipelines for humanities, often blending computer science techniques with domain-specific knowledge.
While focusing on the use of computation in Digital Enlightenment Studies, Tolonen has also continued to publish more traditional pieces, especially on the Scottish Enlightenment. He remains active in archival work while thinking about the future and possibilities of the digital and of computation.
Research output: Contribution to journal › Article › Professional
Research output: Contribution to journal › Article › Scientific › peer-review
Research output: Chapter in Book/Report/Conference proceeding › Chapter › Scientific › peer-review
Research output: Book/Report › Book › Scientific › peer-review
Research output: Contribution to journal › Article › Scientific › peer-review
Tolonen, M. (Principal Investigator), Hengchen, S. (Participant), Kanner, A. (Participant), Säily, T. (Participant), Vaara, V. (Participant), Ijaz, A. (Participant), Roivainen, H. (Participant), Hill, M. J. (Participant), Ros, R. (Participant), Marjanen, J. (Participant), Mäkelä, E. (Participant), Tiihonen, I. (Participant), Lahti, L. (Participant), Tiihonen, I. L. I. (Project manager), Moretti, A. (Project manager), Vesalainen, A. J. K. (Project manager), Rosson, D. E. (Project manager), Wu, Y. (Project manager), Shu, K. (Project manager) & Hinderks, K. S. (Project manager)
01/01/2018 → …
Project: Research project
Pereira da Silva, F. (Project manager), Tolonen, M. (Project manager), Fischer, J. P. (Participant) & Jaksic, N. (Participant)
01/03/2024 → 29/02/2028
Project: EU MSCA Doctoral Networks (TMA-MSCA-DN)
Tolonen, M. (Project manager), Shu, K. (Participant), Wu, Y. (Participant) & Mäkelä, E. (Participant)
01/01/2024 → 31/12/2027
Project: EU MSCA Doctoral Networks (TMA-MSCA-DN)
Toivonen, H. T. (Principal Investigator), Tolonen, M. (Principal Investigator), Kaukonen, M. (Principal Investigator), Marjanen, J. (Participant), Granroth-Wilding, M. (Participant), Leppänen, L. (Participant), Avikainen, J. (Participant), Alnajjar, K. (Participant), Zosa, E. (Participant) & Hengchen, S. (Participant)
01/05/2018 → 31/01/2022
Project: Research project
Tolonen, M. (Principal Investigator), Lahti, L. (Project manager), Salmi, H. (Principal Investigator), Salakoski, T. (Principal Investigator) & Kettunen, K. (Principal Investigator)
01/01/2016 → 31/12/2019
Project: Research project
Tolonen, M. (Creator), Journal of Open Humanities Data, 2023
DOI: 10.5334/johd.101, https://receptionreader.com/
Dataset
Tolonen, M. (Creator), Rosson, D. E. (Creator), Vaara, V. (Creator), Mäkelä, E. (Creator), Mahadevan, A. (Creator) & Ryan, Y. C. (Creator), Journal of Open Humanities Data, 2023
Dataset
Hengchen, S. (Creator), Ros, R. (Creator), Marjanen, J. (Creator) & Tolonen, M. (Creator), Zenodo, 19 Dec 2019
DOI: 10.5281/zenodo.3585027, https://zenodo.org/record/3585027
Dataset
Tolonen, M. (Recipient), 2025
Prize: Prizes and awards
Tolonen, M. (Recipient), Toivonen, H. T. (Recipient), Kaukonen, M. (Recipient), Marjanen, J. (Recipient) & Pivovarova, L. (Recipient), 2024
Prize: Prizes and awards
Tolonen, M. (Recipient), Lahti, L. M. (Recipient), Vaara, V. P. (Recipient), Roivainen, H. H. M. (Recipient), Mäkelä, E. (Recipient) & Kanner, A. O. (Recipient), 2016
Prize: Prizes and awards
Hill, M. J. (Recipient), Vaara, V. (Recipient), Säily, T. (Recipient), Lahti, L. (Recipient) & Tolonen, M. (Recipient), 8 Mar 2019
Prize: Prizes and awards
Mäkelä, E. (Recipient), Lagus, K. (Recipient), Lahti, L. (Recipient), Säily, T. (Recipient), Tolonen, M. (Recipient), Hämäläinen, M. (Recipient), Kaislaniemi, S. (Recipient) & Nevalainen, T. (Recipient), 23 Oct 2020
Prize: Prizes and awards
Tolonen, M. (Visiting researcher)
Activity: Visiting an external institution types › Academic visit to other institution
Tolonen, M. (Chair of organizing committee)
Activity: Participating in or organising an event types › Organisation and participation in conferences, workshops, courses, seminars
Tolonen, M. (Chair of organizing committee)
Activity: Participating in or organising an event types › Organisation and participation in conferences, workshops, courses, seminars
Polvinen, M. (Host) & Tolonen, M. (Host)
Activity: Hosting a visitor types › Academic visit at UH
Tolonen, M. S., Lahti, L., Vaara, V., Roivainen, H. H. M., Marjanen, J. P., Kanner, A. O., Vesanto, A., Ginter, F., Mäkelä, J. I. E., Hill, M. J., Säily, T. I., Ijaz, A. Z. & Hengchen, S.
23/01/2018
1 item of Media coverage
Press/Media: Press / Media
Vaara, V., Tolonen, M., Hill, M. J., Ijaz, A. & Lahti, L.
29/02/2024
1 item of Media coverage
Press/Media: Press / Media
17/11/2020
1 Media contribution
Press/Media: Press / Media
16/01/2016
1 Media contribution
Press/Media: Press / Media