• PL 24 (Unioninkatu 40)

    00014

    Finland

2004–2023

Research activity per year


Personal profile

Description of research and teaching

Curriculum vitae

I work as professor of language technology at the Department of Digital Humanities at the University of Helsinki. My main research interest is in cross-lingual NLP and machine translation.

  • Since August 2015: Professor of Language Technology at the Department of Digital Humanities / HELDIG (formerly at the Department of Modern Languages), University of Helsinki
  • September 2014 – July 2015: Senior Researcher at the Department of Linguistics and Philology, Uppsala University
  • September 2009 – August 2014: Visiting Professor at the Department of Linguistics and Philology, Uppsala University
  • September 2004 – August 2009: PostDoc researcher at the Department of Information Science/Humanities Computing (Informatiekunde), University of Groningen
  • January 2004 – August 2004: Lecturer in computational linguistics and coordinator for the language technology programme, Department of Linguistics and Philology, Uppsala University
  • 2000 – 2003: Ph.D. research at the Department of Linguistics, Uppsala University
  • 2001 – 2002: Visiting Ph.D. student, Division of Informatics, Edinburgh University, UK
  • 1997 – 1999: Research assistant, Department of Linguistics, Uppsala University
  • 1991 – 1997: Masters in Computer Science (Diplom für Informatik), “Otto-von-Guericke” University, Magdeburg, Germany

Recent Projects

Resources and Tools

  • OPUS – a collection of freely available parallel corpora and tools
  • fiskmö translator – a translation demo for the Nordic languages
  • efmaral and eflomal – tools for efficient word alignment
  • WMT en-fi 2016 and 2017: official MT test sets for Finnish-English
  • HNMT – the Helsinki Neural Machine Translation system
  • Lingua::Align – a toolbox for tree-to-tree alignment
  • Uplug – a toolbox for processing parallel corpora
  • Lingua::Ident::Blacklists – language identifier for related languages
  • Docent – a document-level SMT decoder
  • pdf2xml – a converter for PDF documents
  • subalign – tools for converting and aligning movie subtitles
  • Helsinki-NLP at GitHub and Bitbucket

Active PhD Students

Former PhD Students

Education/Academic qualification

Computational Linguistics, PhD, Uppsala University

2000–2003

Award Date: 12 Dec 2003

Computer Science, M.Sc., Otto von Guericke University, Magdeburg

1991–1997

Award Date: 11 Sep 1997

Fields of Science

  • 6121 Languages
  • Computational Linguistics
  • machine translation
  • 113 Computer and information sciences
  • language technology
  • machine learning
  • Natural language processing
  • Artificial intelligence

International and National Collaboration

Publications and projects within the past five years.