Abstract
This paper presents a knowledge graph created by transforming the plenary debates of the Parliament of Finland (1907-) into Linked Open Data (LOD). The data, totaling over 900 000 speeches, with automatically created semantic annotations and rich ontology-based metadata, are published in a Linked Open Data Service and are used via a SPARQL API and as data dumps. The speech data are part of the larger LOD publication FinnParla, which also includes prosopographical data about the politicians. The data are being used for studying parliamentary language and culture in Digital Humanities at several universities. To serve a wider variety of users, the entirety of this data was also produced using Parla-CLARIN markup. We present the first publication of all Finnish parliamentary debates as data. Technical novelties in our approach include the use of both Parla-CLARIN and an RDF schema developed for representing the speeches, integration of the data with a new Parliament of Finland Ontology for deeper data analyses, and enrichment of the data with a variety of external national and international data sources.
| Original language | English |
|---|---|
| Title of host publication | 3rd Conference on Language, Data and Knowledge, LDK 2021 |
| Editors | Dagmar Gromann, Gilles Serasset, Thierry Declerck, John P. McCrae, Jorge Gracia, Julia Bosque-Gil, Fernando Bobillo, Barbara Heinisch |
| Number of pages | 17 |
| Publisher | Schloss Dagstuhl – Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing |
| Publication date | 1 Aug 2021 |
| Pages | 1-17 |
| Article number | 8 |
| ISBN (Electronic) | 978-3-95977-199-3 |
| Publication status | Published - 1 Aug 2021 |
| MoE publication type | A4 Article in conference proceedings |
| Event | 3rd Conference on Language, Data and Knowledge, LDK 2021, Zaragoza, Spain (1 Sept 2021 → 3 Sept 2021) |
Publication series
| Name | OpenAccess Series in Informatics |
|---|---|
| Volume | 93 |
| ISSN (Print) | 2190-6807 |
Bibliographical note
Funding Information: Thanks to Ari Apilo, Sari Wilenius, and Päivikki Karhula of the Parliament of Finland (PoF) for providing material for the project. Our work was funded by the Academy of Finland as part of the Semantic Parliament project and by the EU project InTaVia: In/Tangible European Heritage, and is related to the COST action NexusLinguarum on linguistic data science. CSC – IT Center for Science, Finland, provided computational resources for the work.
Publisher Copyright:
© Laura Sinikallio, Senka Drobac, Minna Tamper, Rafael Leal, Mikko Koho, Jouni Tuominen, Matti La Mela, and Eero Hyvönen; licensed under Creative Commons License CC-BY 4.0
Fields of Science
- Digital humanities
- Linked open data
- Parla-CLARIN
- Parliamentary data
- Plenary debates
- 113 Computer and information sciences