On New Text Corpora For Minority Languages On The Helsinki korp.csc.fi Server

Research output: Conference materials, paper, peer-reviewed

Abstract

The korp.csc.fi server in Finland provides text corpora of multiple varieties for numerous languages, large and small. The Korp infrastructure is developed by the Swedish Språkbanken at the University of Gothenburg, and the source code is released under the MIT license. The open nature of the system makes it easy to transfer to new environments, and numerous Korp installations are already available. The one we discuss is maintained by the Language Bank of Finland.
Original language: English
Pages: 32–36
Number of pages: 5
Publication status: Published - 20 Dec 2019
MoE publication type: Not Eligible
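Event: Электронная письменность народов Российской Федерации: опыт, проблемы и перспективы (Electronic Writing of the Peoples of the Russian Federation: Experience, Problems and Prospects) - Ufa, Russian Federation
Duration: 27 Nov 2019 to 29 Nov 2019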

Conference

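Conference: Электронная письменность народов Российской Федерации: опыт, проблемы и перспективы (Electronic Writing of the Peoples of the Russian Federation: Experience, Problems and Prospects)
Country: Russian Federation
City: Ufa
Period: 27/11/2019 to 29/11/2019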

Fields of Science

  • 6121 Languages
  • Minority languages
  • low-resourced languages
  • Uralic corpora
  • minimal data for Korp
  • FINCLARIN
