Mylly – The Mill: A platform for processing and analyzing your language data

Research output: Conference materials › Abstract

Abstract

Mylly is a data analysis platform where language researchers can process their data in a graphical user interface. Users can populate their personal, persistent sessions in the Mylly workspace by importing local files (or files available on the web), by querying the Korp API directly from Mylly, and by processing their files in Mylly. The files resulting from each transformation can be examined directly in the user interface, processed further, or exported to the user's own computer. Mylly automatically tracks the workflow. Notes can be added, and workflows can be repeated on new files. Current tools include morphosyntactic analysis of plain text, automatic speech recognition for Finnish, finite-state transducer technology, format conversions, some statistics, and a general relational toolkit. Tools to manipulate VRT documents (annotated tokenized text) are forthcoming.

Mylly is based on the open-source Chipster platform, developed for bioinformatics at CSC – IT Center for Science. The previous Java client is being replaced with standard HTML5 technology that runs directly in a regular browser. The new version supports federated login through services such as HAKA or eduGAIN. The new backend uses OpenShift container technology, which can distribute resources across virtual servers transparently and scalably. The current tools are being configured for the new Mylly implementation, which we expect to roll out in a few months.

You are welcome to take a peek at the new Mylly and to discuss your wishes with us!

Find out more: https://www.kielipankki.fi/support/mylly
Original language: Finnish
Publication status: Published - 9 Oct 2018
MoE publication type: Not Eligible
Event: CLARIN Annual Conference - Hotel Galilei, Pisa, Italy
Duration: 8 Oct 2018 - 10 Oct 2018
https://www.clarin.eu/event/2018/clarin-annual-conference-2018-pisa-italy

Conference

Conference: CLARIN Annual Conference
Country/Territory: Italy
City: Pisa
Period: 08/10/2018 - 10/10/2018

Fields of Science

  • 6121 Languages
  • 6160 Other humanities
  • 113 Computer and information sciences
