Building and use of language technology (BAULT)

Description

PUBLIC DESCRIPTION:
The aim of the research community (RC) is to make language resources (i.e. materials and tools) seamlessly available to researchers, to develop and facilitate new types of language-oriented research, and to develop new multilingual computational methods for processing language materials.

The group includes the FIN-CLARIN project, which is listed in the national roadmap of the 20 new infrastructures to be built and is a component of the European CLARIN (Common Language Resource and Technology Infrastructure) on the ESFRI roadmap. The services of FIN-CLARIN, including the Language Bank of Finland, are hosted by CSC IT Center for Science, Ltd.

FIN-CLARIN and similar services are used in linguistic research to study language structure, use and variation, and in developing language technology applications such as spellers, parsers and machine translation.

In particular, the University of Helsinki is widely known for its leading-edge research on finite-state parsing methods. The morphological two-level model has some 1000 references in Google Scholar. The recent Helsinki Finite State Technology (HFST) project has brought together several international research groups to cooperate on the HFST platform.

Practically all members of the RC participate (or have participated) as supervisors or students in the Langnet national PhD graduate school. Many members of the RC also participated in another national graduate school, the Finnish Graduate School of Language Technology (2002-2009), which was merged with Langnet in 2009. Many PhD students in linguistics make use of the language resources, whereas language technology students develop methods for natural language processing. All UH PhD and MA students of Russian are taught to use sophisticated language databases such as Integrum and the Russian National Corpus.

Responsible person: Kimmo Koskenniemi, Department of Modern Languages

Participation category: 3
Status: Finished
Effective start/end date: 01/01/2005 - 31/12/2016