Building and use of language technology (BAULT)



The aim of the research community (RC) is to make language resources (i.e. materials and tools) seamlessly available to researchers, to develop and facilitate new types of language-oriented research, and to develop new multilingual computational methods for processing language materials.

The group includes the FIN-CLARIN project, which is listed in the national roadmap of the 20 new infrastructures to be built and is a component of the European CLARIN (Common Language Resources and Technology Infrastructure) on the ESFRI roadmap. The services of FIN-CLARIN, including the Language Bank of Finland, are hosted by the CSC IT Center for Science, Ltd.

FIN-CLARIN and similar services are used in linguistic research on language structure, use and variation, and in developing language technology applications such as spellers, parsers and machine translation.

In particular, the University of Helsinki is widely known for its leading-edge research on finite-state parsing methods. The morphological two-level model has some 1000 references in Google Scholar. The recent Helsinki Finite State Technology (HFST) project has brought together several international research groups to cooperate on the HFST platform.

Practically all members of the RC participate (or have participated) as supervisors or students in the Langnet national PhD graduate school. Many members of the RC also took part in another national graduate school, the Finnish Graduate School of Language Technology (2002-2009), which was merged with Langnet in 2009. Many linguistics PhD students make use of the language resources, whereas language technology students develop methods for natural language processing. All UH PhD and MA students of Russian are taught to use sophisticated language databases such as Integrum and the Russian National Corpus.

Responsible person: Kimmo Koskenniemi, Department of Modern Languages

Participation category: 3
Effective start/end date: 01/01/2005 → 31/12/2016


Two-level morphology

Koskenniemi, K.

01/04/1981 → …

Project: Research project

CLARA - Common Language Resources and their Applications

Koskenniemi, K. & Linden, K.

Unknown funder


Project: Research project

Common Language Resources and Technology Infrastructure (CLARIN)

Koskenniemi, K., Linden, K., Oksanen, V. & Yangarber, R.


Project: Research project


On Practical Realisation of Autosegmental Representations in Lexical Transducers of Tonal Bantu Languages

Yli-Jyrä, A., 13 Jan 2020, (Submitted) LT4ALL. UNESCO, 4 p.

Research output: Chapter in book/report/conference proceedings, Conference contribution, Scientific, Peer-reviewed

Optimal Kornai-Karttunen Codes for Restricted Autosegmental Representations

Yli-Jyrä, A. M., 2019, Tokens of Meaning: Papers in Honor of Lauri Karttunen. Condoravdi, C. & Holloway King, T. (eds.). Stanford: Center for the Study of Language and Information (CSLI)

Research output: Chapter in book/report/conference proceedings, Chapter, Scientific, Peer-reviewed

Bounded-Depth High-Coverage Search Space for Noncrossing Parses

Yli-Jyrä, A. M., 4 Sep 2017, Proceedings of the 13th International Conference on Finite State Methods and Natural Language Processing: FSMNLP 2017. Drewes, F. (ed.). Stroudsburg: The Association for Computational Linguistics, p. 30-40, 11 p.

Research output: Chapter in book/report/conference proceedings, Conference contribution, Scientific, Peer-reviewed

Open access


  • 4 Academic visit to UH
  • 3 Organisation of and participation in conference/workshop/course/seminar
  • 1 Types of other activities - External teaching and subject coordination
  • 1 Oral presentation

András Kornai

Anssi Yli-Jyrä (Host)

25 May 2018 → 1 Jun 2018

Activity: Types of hosting a visitor, Academic visit to UH

The Power of Constraint Grammars Revisited

Anssi Yli-Jyrä (Speaker)

11 May 2017

Activity: Types of talk or presentation, Oral presentation

Workshop on Multilingual and Cross-Lingual NLP

Anssi Yli-Jyrä (Attendee)

11 Feb 2016 → 12 Feb 2016

Activity: Types of participation in or organisation of an event, Organisation of and participation in conference/workshop/course/seminar