Mylly – The Mill: A platform for processing and analyzing your language data

Research output: Conference contribution › Abstract

Abstract

Mylly is a data analysis platform where language researchers can process their data in a graphical user interface. Users can populate their personal, persistent sessions in the Mylly workspace by importing local files (or files available on the web), by querying the Korp API directly from Mylly, and by processing their files in Mylly. The files resulting from each transformation can be examined directly in the user interface, or processed further, or exported locally. Mylly automatically tracks the workflow. Notes can be added, and workflows can be repeated on new files. Current tools include morphosyntactic analysis of plain text, automatic speech recognition for Finnish, finite-state transducer technology, conversions, some statistics, and a general relational toolkit. Tools to manipulate VRT documents (annotated tokenized text) are forthcoming.
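The workflow tracking described above can be illustrated with a minimal sketch. It is not Mylly's or Chipster's implementation: the `Workflow` class and the toy tools `lowercase` and `tokenize` are hypothetical names invented for this example. They only show the idea of recording each processing step, with an optional note, so that the same sequence can be repeated on a new file.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Workflow:
    """Hypothetical recorder: remembers each tool applied, plus a note."""
    steps: List[Callable[[str], str]] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)

    def apply(self, tool: Callable[[str], str], data: str, note: str = "") -> str:
        # Record the step first, then run the tool on the data.
        self.steps.append(tool)
        self.notes.append(note)
        return tool(data)

    def replay(self, data: str) -> str:
        # Run every recorded tool, in order, on new input.
        for tool in self.steps:
            data = tool(data)
        return data


def lowercase(text: str) -> str:
    """Toy stand-in for a conversion tool."""
    return text.lower()


def tokenize(text: str) -> str:
    """Toy stand-in for a tokenizer: one whitespace-separated token per line."""
    return "\n".join(text.split())


workflow = Workflow()
first = workflow.apply(lowercase, "Mylly On Mylly", note="normalise case")
first = workflow.apply(tokenize, first, note="one token per line")

# Repeat the recorded steps on a different input.
second = workflow.replay("Kielipankki Tarjoaa Aineistoja")
```

In this sketch each tool is a plain function from text to text. A real platform would also have to track files, tool parameters and intermediate results.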

Mylly is based on the open source Chipster platform, developed for bioinformatics at CSC – IT Center for Science. The previous Java client is being replaced with standard HTML5 technology that runs directly in a regular browser. The new version supports federated login such as HAKA or eduGAIN. The new backend uses OpenShift container technology that can distribute resources on virtual servers transparently and scalably. The current tools are being configured for the new Mylly implementation, which we expect to roll out in a few months.

You are welcome to take a peek at the new Mylly and to discuss your wishes with us!

Find out more: https://www.kielipankki.fi/support/mylly
Original language: Finnish
Status: Published - 9 Oct 2018
MoE publication type: Not eligible
Event: CLARIN Annual Conference - Hotel Galilei, Pisa, Italy
Duration: 8 Oct 2018 to 10 Oct 2018
https://www.clarin.eu/event/2018/clarin-annual-conference-2018-pisa-italy

Conference

Conference: CLARIN Annual Conference
Country/Territory: Italy
City: Pisa
Period: 8 Oct 2018 to 10 Oct 2018

Fields of Science

  • 6121 Languages
  • 6160 Other humanities
  • 113 Computer and information sciences
