Organisation profile
Language technology is a multidisciplinary field, often labelled computational linguistics, natural language processing (NLP) or natural language engineering (NLE). In language technology we study methods and develop models and tools for processing human language. This includes models for natural language understanding and generation, both within and across languages. In Helsinki we focus on:
- Cross-lingual NLP including machine translation
- NLP for languages with a rich morphology
- NLP for low-resource languages and in the humanities
Activities and news from our research group are available on our website.
Fields of Science
- 113 Computer and information sciences
- language technology
- natural language processing
- natural language engineering
- 6121 Languages
- computational linguistics
- language technology
Profiles
-
Mikko Aulamo
- Department of Digital Humanities - Doctoral Researcher
- Language Technology
- Doctoral Programme in Language Studies
Person: U1 Research and teaching staff, Doctoral Researcher
-
Hande Celikkanat
- Department of Digital Humanities - University Researcher
- Language Technology
Person: U3 Research and teaching staff
-
Mathias Creutz
- Department of Digital Humanities - Senior University Lecturer, Title of Docent
- Language Technology
Person: U3 Research and teaching staff
Equipment
-
Helsinki Term Bank for the Arts and Sciences, former Bank of Finnish Terminology in Arts and Sciences
Tiina Onikki-Rantajääskö (Manager), Antti Olavi Kanner (Operator), Niklas Mikael Laxström (Operator), Eeva Johanna Enqvist (Other) & Harri Kettunen (Other)
Department of Finnish, Finno-Ugrian and Scandinavian Studies
Facility/equipment: Equipment
-
Language Bank of Finland during the period 2005-2008.
Anssi Yli-Jyrä (Manager)
Language Technology
Facility/equipment: Core Facility
-
nVidia GTX Titan X GPU Workstation in the Department of Digital Humanities at Metsätalo
Anssi Yli-Jyrä (Manager)
Language Technology
Facility/equipment: Equipment
-
nVidia RTX 2080Ti GPU for a Workstation at the Department of Digital Humanities, Metsätalo
Anssi Yli-Jyrä (Manager)
Language Technology
Facility/equipment: Equipment
-
GreenNLP: Green NLP - controlling the carbon footprint in sustainable language technology
Tiedemann, J. & Nieminen, T. J.
Suomen Akatemia Projektilaskutus
01/01/2023 → 31/12/2025
Project: Academy of Finland: Targeted Academy Project
-
Rapporteur to chart the state of the Finnish language
Onikki-Rantajääskö, T. & Kanner, A.
01/11/2022 → 30/04/2024
Project: Ministry funding
-
High Performance Language Technologies
Tiedemann, J., Aulamo, M., Ji, S. & Virpioja, S. P.
Charles University in Prague Faculty of Science Department of Teaching and Didactics of Biology
01/09/2022 → 31/08/2025
Project: EU Horizon Europe: Innovation actions (HORIZON-IA)
-
Uncertainty-aware neural language models
Tiedemann, J., Celikkanat, H., Virpioja, S. P. & Vazquez, R.
Academy of Finland, Suomen Akatemia Projektilaskutus
01/01/2022 → 01/10/2025
Project: Research project
-
Automatic detection of place and time for Greek texts in Egypt
Jauhiainen, T., Henriksson, E., Vierros, M. & Jauhiainen, H., 2023.
Research output: Conference materials › Poster › peer-review
Open Access
-
Automatic text simplification of Russian texts using control tokens
Dmitrieva, A., May 2023, Proceedings of the 9th Workshop on Slavic Natural Language Processing 2023 (SlavicNLP 2023). Piskorski, J., Marcińczuk, M., Nakov, P. et al. (eds.). Stroudsburg: Association for Computational Linguistics (ACL), p. 70-77 8 p.
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review
Open Access
-
Character Alignment Methods for Dialect-to-Standard Normalization
Scherrer, Y., 1 Jul 2023, Proceedings of the 20th SIGMORPHON workshop on Computational Research in Phonetics, Phonology, and Morphology. Nicolai, G., Chodroff, E., Mailhot, F. & Çöltekin, Ç. (eds.). Stroudsburg: The Association for Computational Linguistics, p. 110-116 7 p.
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review
Open Access
-
Creating a parallel Finnish—Easy Finnish dataset from news articles
Dmitrieva, A. & Konovalova, A., Jun 2023, Proceedings of the 1st Workshop on Open Community-Driven Machine Translation. Esplá-Gomis, M., Forcada, M., Kuzman, T., Ljubešić, N., van Noord, R., Ramírez-Sánchez, G., Tiedemann, J. & Toral, A. (eds.). Universitat d’Alacant, p. 21-26 6 p.
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review
Open Access
-
Detection and attribution of quotes in Finnish news media: BERT vs. rule-based approach
Janicki, M., Kanner, A. & Mäkelä, E., May 2023, Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa). Alumäe, T. & Fishel, M. (eds.). Tartu: University of Tartu Library, p. 52-59 8 p. (NEALT Proceedings Series; no. 52).
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review
Open Access
Activities
-
Corpus-based computational dialectology - Data, methods and results
Yves Scherrer (Speaker)
5 Jun 2023
Activity: Talk or presentation types › Invited talk
-
Digital Humanities in the Nordic and Baltic Countries 2023 (Event)
Tommi Jauhiainen (Reviewer)
2023 → …
Activity: Publication peer-review and editorial work types › Peer review of manuscripts
-
Automatic Language Identification: General Introduction and Applications to Ancient Texts
Tommi Jauhiainen (Speaker)
8 May 2023
Activity: Talk or presentation types › Oral presentation
-
Using Word Embeddings for Identifying Emotions Relating to the Body in a Neo-Assyrian Corpus
Eleanor Rose Bennett (Speaker) & Aleksi Sahala (Speaker)
8 Sep 2023
Activity: Talk or presentation types › Oral presentation
-
Finnish Journal of Linguistics (Journal)
Olli Vilhelm Kuparinen (Reviewer)
2023 → …
Activity: Publication peer-review and editorial work types › Peer review of manuscripts
Prizes
-
August Ahlqvistin, Yrjö Wichmannin, Kai Donnerin ja Artturi Kanniston rahastojen väitöskirjapalkinto
Kuparinen, Olli Vilhelm (Recipient), 14 Mar 2022
Prize: Prizes and awards
-
Best paper award at DHN 2020
Mäkelä, Eetu (Recipient), Lagus, Krista (Recipient), Lahti, Leo (Recipient), Säily, Tanja (Recipient), Tolonen, Mikko (Recipient), Hämäläinen, Mika (Recipient), Kaislaniemi, Samuli (Recipient) & Nevalainen, Terttu (Recipient), 23 Oct 2020
Prize: Prizes and awards
-
Tampereen yliopiston tukisäätiön väitöskirjastipendi
Kuparinen, Olli Vilhelm (Recipient), 2022
Prize: Prizes and awards
Datasets
-
Murreviikko: an Annotated and Normalized Corpus of Dialectal Finnish Tweets
Kuparinen, O. V. (Creator), Zenodo, 2023
Dataset
-
OcWikiAnnot: Annotated Wikipedia Corpus of Occitan
Miletic Haddad, A. (Creator), Zenodo, 20 Apr 2023
DOI: 10.5281/zenodo.7777340, https://doi.org/10.5281/zenodo.7777340
Dataset
-
OcWikiDisc: a Corpus of Wikipedia Talk Pages in Occitan
Miletic Haddad, A. (Creator) & Scherrer, Y. (Creator), Zenodo, 14 Sep 2022
DOI: 10.5281/zenodo.7079580, https://doi.org/10.5281/zenodo.7079580
Dataset
-
ANEE Lexical Networks v. 2.0 - the Dataset
Sahala, A. (Creator), Jauhiainen, H. (Creator), Alstola, T. (Creator), Hardwick, S. (Creator), Bennett, E. R. (Creator), Jauhiainen, T. (Creator), Svärd, S. (Creator) & Linden, K. (Creator), Zenodo, 29 Sep 2022
Dataset
-
ANEE Lexical Networks v. 2.0
Sahala, A. (Creator), Jauhiainen, H. (Creator), Alstola, T. (Creator), Hardwick, S. (Creator), Bennett, E. R. (Creator), Jauhiainen, T. (Creator), Linden, K. (Creator) & Svärd, S. (Creator), University of Helsinki, 29 Sep 2022
http://urn.fi/urn:nbn:fi:lb-2022100301
Dataset
Press/Media
-
Språk(teknologi) är nyckeln till intelligens och rättvisa
20/01/2022
1 Media contribution
Press/Media: Press / Media
-
芬兰研究人员正在教人工智能讲流利的芬兰语方言
Mika Hämäläinen, Khalid Alnajjar, Jack Rueter & Niko Partanen
10/01/2022
1 item of Media coverage
Press/Media: Press / Media
-
Inteligência artificial identifica 23 dialetos em finlandês
Mika Hämäläinen, Khalid Alnajjar, Jack Rueter & Niko Partanen
17/12/2021
1 item of Media coverage
Press/Media: Press / Media
-
Researchers teach artificial intelligence to be fluent in Finnish dialects
Mika Hämäläinen, Khalid Alnajjar, Niko Partanen & Jack Rueter
16/12/2021
1 Media contribution
Press/Media: Press / Media