Research output per year
Professor of Digital Humanities
PL 24 (Unioninkatu 40)
00014
Finland
Mikko Tolonen is professor of Digital Humanities at the Faculty of Arts at the University of Helsinki. He has a PhD in intellectual history (2010) and he is the PI and founder of the Helsinki Computational History Group (COMHIS) at the Department of Digital Humanities.
Tolonen's main research focus is on the Enlightenment era and on integrated interdisciplinary studies of public discourse, knowledge production, and book and intellectual history. These studies combine metadata from library catalogues with full-text and image libraries of books, newspapers and periodicals in early modern Europe. He is one of the editors of Hume's History of England for Oxford University Press. In 2016, he was awarded an Open Science and Research Award by the Finnish Ministry of Education. In 2025, he gave the annual Voltaire Foundation lecture on Digital Enlightenment Studies at the University of Oxford.
In digital humanities education, he believes in project-based teaching, exemplified by the annual award-winning Helsinki Digital Humanities Hackathon, which he founded in 2015. Tolonen is also the local PI at UH in two Marie Curie Training Networks (CASCADE and MECANO).
Tolonen is active in the development of infrastructures and collaborative networks. He has served on the executive board of the European Association for Digital Humanities (EADH) and as the chair of Digital Humanities in the Nordic and Baltic Countries (DHNB). He has also helped advance humanities research infrastructure building in Finland (FIN-CLARIAH and DARIAH-FI).
Tolonen supervises work across COMHIS’s wide research spectrum and welcomes contact from talented early-career scholars keen to collaborate with his group. Former international postdocs from the group have since secured lecturer positions at leading European universities.
For interviews, see: CSC interview (in Finnish) and 375 humanists (in English and other languages). On social media: Bluesky (@tolonen.bsky.social) and Twitter (@mikko_tolonen).
Mikko Tolonen’s work spans digital humanities, book history and intellectual history, focusing on early modern print culture and computational analysis. The COMHIS approach towards books as cultural artefacts, data and vehicles of meaning can be described as holistic (in other words, books are not treated merely as texts).
Tolonen's research has been published in central digital humanities forums (Journal of Cultural Analytics, Digital Scholarship in the Humanities, Historical Methods, Digital Enlightenment Studies, Journal of Open Humanities Data) and in main NLP venues such as ACL. At the same time, he has also published in well-established traditional journals, such as Historical Journal, Eighteenth-Century Studies, Huntington Library Quarterly and Explorations in Economic History, and in books by Oxford University Press and Cambridge University Press. This versatility is crucial for computational history.
Publishing in the best traditional forums functions as the litmus test for the relevance of computation in historical research. These results reflect COMHIS’s long-term strategy: most publications stem from integrated interdisciplinary work and are co-authored by group members, at times together with external partners.
Below, some of Tolonen's publications are grouped into thematic categories, with each category highlighting the evolution of computational methods (from early text-reuse detection to advanced machine learning) applied to those topics. Each category lists selected publications (with title, year and a link to the full publication details, including access to the article itself).
Tolonen studies early modern book history and print culture using large-scale bibliographic data. His work integrates library catalogues and addresses challenges of data quality and completeness in national bibliographies. Early studies introduced “bibliographic data science” to map publishing trends (e.g. book formats, vernacularization), while later research employs quantitative analysis (statistical and data-driven models) to uncover patterns in publication language, canon formation and economic aspects of the book trade.
This category covers intellectual and publishing networks of the Enlightenment and beyond. Tolonen applies social network analysis and bibliometric methods to understand how ideas and publications spread through networks of authors, publishers and communities. Using large-scale bibliographic datasets, his studies reveal structural patterns—for example, mapping the Scottish Enlightenment print ecosystem uncovered distinct publisher roles in Edinburgh vs. London and emphasized relational careers of key figures. Early work in this area introduced data-driven network analysis of Enlightenment publishing, and recent publications in leading journals contextualize major and marginal players in the knowledge distribution networks over time.
Tolonen’s research also explores newspaper archives and the emergence of a public sphere. He employs computational text analysis to study changing vocabularies, discourse dynamics, and the reach of periodicals across time and geography. Earlier studies analyzed basic features (language, location, publication frequency) of Finnish newspapers to delineate a national public sphere. Subsequently, more sophisticated natural language processing techniques were introduced. For example, one study used dependency parsing and neural word embeddings to trace how the concept of “nation” evolved semantically across four languages’ newspaper corpora. Such work illustrates a shift from manual analysis to data-driven methods that detect long-term conceptual changes and cross-lingual trends in public discourse.
A significant thread in Tolonen’s work is the study of text reuse and intertextuality, shedding light on how ideas and texts were circulated and re-purposed in the early modern period. Recognizing that matching shared passages across documents can reveal the spread and evolution of ideas, he helped develop computational tools for large-scale text reuse detection. The Reception Reader web tool, for instance, enables scholars to visually explore the reuse of texts in Early English Books Online and ECCO, revealing previously hidden patterns of reception across time. Tolonen’s publications trace a methodological progression from targeted case studies (e.g. comparing text similarity in one author’s works) to building optimized, big-data systems. A recent contribution reports on handling “billions of text reuse instances” from nearly all 18th-century printed texts, detailing how a hybrid data pipeline (SQL databases combined with Spark big-data processing) was optimized to support interactive humanities research.
Tolonen’s interdisciplinary work extends to language change and stylistic analysis in historical texts. Using corpora like ECCO and newspapers, these studies quantify how language and genres evolved over the Enlightenment and modern era. Earlier efforts applied statistical techniques to detect shifts within texts (for example, identifying genre changes or register shifts in multi-genre works). More recently, Tolonen’s collaborations have leveraged modern language models. One project trained a BERT-based model on 18th-century text data to predict publication dates. Other works similarly harness transformer models and distributional semantics to classify text registers and track domain-specific vocabulary change (e.g. the rise of economic terminology) over time. This trajectory illustrates the field’s shift from rule-based or manual analysis to explainable AI approaches in studying historical language.
Across all topics, Tolonen contributes to developing computational methods, tools and data workflows that advance digital humanities research. These publications focus on the infrastructure and methodological frameworks enabling large-scale analysis of historical data. In the bibliographic domain, Tolonen has helped formalize best practices for data curation and integration – from automatically determining edition groupings via metadata, to defining open workflows for multilingual library data (through DARIAH’s working groups on bibliographical data). He also explored statistical approaches to uncertainty in historical datasets, using probabilistic programming to model biases and gaps in sources. Collectively, these works show a progression toward more robust, scalable and transparent computational pipelines for humanities, often blending computer science techniques with domain-specific knowledge.
While focusing on the use of computation in Digital Enlightenment Studies, Tolonen has also continued to publish more traditional pieces, especially on the Scottish Enlightenment. He remains active in archival work while thinking about the future and possibilities of the digital and computation.
Research output: Contribution to journal › Article › Professional
Research output: Contribution to journal › Article › Scientific › Peer-reviewed
Research output: Chapter in book/report/conference proceeding › Chapter › Scientific › Peer-reviewed
Research output: Book/report › Book › Scientific › Peer-reviewed
Research output: Contribution to journal › Article › Scientific › Peer-reviewed
Tolonen, M. (Principal Investigator), Hengchen, S. (Participant), Kanner, A. (Participant), Säily, T. (Participant), Vaara, V. (Participant), Ijaz, A. (Participant), Roivainen, H. (Participant), Hill, M. J. (Participant), Ros, R. (Participant), Marjanen, J. (Participant), Mäkelä, E. (Participant), Tiihonen, I. (Participant), Lahti, L. (Participant), Tiihonen, I. L. I. (Project manager), Moretti, A. (Project manager), Vesalainen, A. J. K. (Project manager), Rosson, D. E. (Project manager), Wu, Y. (Project manager), Shu, K. (Project manager) & Hinderks, K. S. (Project manager)
01/01/2018 → …
Project: Research project
Pereira da Silva, F. (Project manager), Tolonen, M. (Project manager), Fischer, J. P. (Participant) & Jaksic, N. (Participant)
01/03/2024 → 29/02/2028
Project: EU Horizon Europe: MSCA Doctoral Networks (HORIZON-TMA-MSCA-DN)
Tolonen, M. (Project manager), Shu, K. (Participant), Wu, Y. (Participant) & Mäkelä, E. (Participant)
01/01/2024 → 31/12/2027
Project: EU Horizon Europe: MSCA Doctoral Networks (HORIZON-TMA-MSCA-DN)
Toivonen, H. T. (Principal Investigator), Tolonen, M. (Principal Investigator), Kaukonen, M. (Principal Investigator), Marjanen, J. (Participant), Granroth-Wilding, M. (Participant), Leppänen, L. (Participant), Avikainen, J. (Participant), Alnajjar, K. (Participant), Zosa, E. (Participant) & Hengchen, S. (Participant)
01/05/2018 → 31/01/2022
Project: Research project
Tolonen, M. (Principal Investigator), Lahti, L. (Project manager), Salmi, H. (Principal Investigator), Salakoski, T. (Principal Investigator) & Kettunen, K. (Principal Investigator)
01/01/2016 → 31/12/2019
Project: Research project
Tolonen, M. (Creator), Journal of Open Humanities Data, 2023
DOI: 10.5334/johd.101, https://receptionreader.com/
Dataset
Tolonen, M. (Creator), Rosson, D. E. (Creator), Vaara, V. (Creator), Mäkelä, E. (Creator), Mahadevan, A. (Creator) & Ryan, Y. C. (Creator), Journal of Open Humanities Data, 2023
Dataset
Hengchen, S. (Creator), Ros, R. (Creator), Marjanen, J. (Creator) & Tolonen, M. (Creator), Zenodo, 19 Dec 2019
DOI: 10.5281/zenodo.3585027, https://zenodo.org/record/3585027
Dataset
Tolonen, M. (Recipient), 2025
Prize: Prizes and awards
Tolonen, M. (Recipient), Toivonen, H. T. (Recipient), Kaukonen, M. (Recipient), Marjanen, J. (Recipient) & Pivovarova, L. (Recipient), 2024
Prize: Prizes and awards
Tolonen, M. (Recipient), Lahti, L. M. (Recipient), Vaara, V. P. (Recipient), Roivainen, H. H. M. (Recipient), Mäkelä, E. (Recipient) & Kanner, A. O. (Recipient), 2016
Prize: Prizes and awards
Hill, M. J. (Recipient), Vaara, V. (Recipient), Säily, T. (Recipient), Lahti, L. (Recipient) & Tolonen, M. (Recipient), 8 Mar 2019
Prize: Prizes and awards
Mäkelä, E. (Recipient), Lagus, K. (Recipient), Lahti, L. (Recipient), Säily, T. (Recipient), Tolonen, M. (Recipient), Hämäläinen, M. (Recipient), Kaislaniemi, S. (Recipient) & Nevalainen, T. (Recipient), 23 Oct 2020
Prize: Prizes and awards
Tolonen, M. (Visiting researcher)
Activity: Types of visit to an external institution › Academic visit to another institution
Tolonen, M. (Chair of organizing committee)
Activity: Types of participation in or organization of events › Organization of and participation in conference/workshop/course/seminar
Tolonen, M. (Chair of organizing committee)
Activity: Types of participation in or organization of events › Organization of and participation in conference/workshop/course/seminar
Polvinen, M. (Host) & Tolonen, M. (Host)
Activity: Types of hosting a visitor › Academic visit to UH
Tolonen, M. S., Lahti, L., Vaara, V., Roivainen, H. H. M., Marjanen, J. P., Kanner, A. O., Vesanto, A., Ginter, F., Mäkelä, J. I. E., Hill, M. J., Säily, T. I., Ijaz, A. Z. & Hengchen, S.
23/01/2018
1 item of media coverage
Press/media: Press / Media
Vaara, V., Tolonen, M., Hill, M. J., Ijaz, A. & Lahti, L.
29/02/2024
1 item of media coverage
Press/media: Press / Media
16/01/2016
1 media contribution
Press/media: Press / Media