• P.O. Box 24 (Unioninkatu 40 B)

    00014 University of Helsinki

    Finland

Publications

2020

A Finnish news corpus for named entity recognition

Ruokolainen, T., Kauppinen, P., Silfverberg, M. & Lindén, K., Mar 2020, In: Language Resources and Evaluation. 54, 1, p. 247-272 26 p.

Research output: Contribution to journal › Article › Scientific › Peer-reviewed

An Evaluation Benchmark for Testing the Word Sense Disambiguation Capabilities of Machine Translation Systems

Raganato, A., Scherrer, Y. & Tiedemann, J., 1 May 2020, Proceedings of The 12th Language Resources and Evaluation Conference. Marseille, France: European Language Resources Association (ELRA), p. 3668-3675 8 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

Automated Phonological Transcription of Akkadian Cuneiform Text

Sahala, A., Silfverberg, M., Arppe, A. & Linden, K., 17 May 2020, Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020). Calzolari ... [et al.], N. (eds.). Paris: European Language Resources Association (ELRA), p. 3528-3534 7 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

BabyFST: Towards a Finite-State Based Computational Model of Ancient Babylonian

Sahala, A., Silfverberg, M., Arppe, A. & Linden, K., 17 May 2020, Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020). Calzolari ... [et al.], N. (eds.). Paris: European Language Resources Association (ELRA), p. 3886-3894 9 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

Challenges in Annotation: Annotator Experiences from a Crowdsourced Emotion Annotation Task

Öhman, E., 2020, Digital Humanities in the Nordic Countries 2020. CEUR Workshop Proceedings

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

File

Effects of Language Relatedness for Cross-lingual Transfer Learning in Character-Based Language Models

Singh, M., Smit, P., Virpioja, S. & Kurimo, M., 1 May 2020, Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL). Marseille, France: European Language Resources Association (ELRA), p. 41-45 5 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access

Emotion Preservation in Translation: Evaluating Datasets for Annotation Projection

Kajava, K. S. A., Öhman, E. S., Hui, P. & Tiedemann, J., 2020, Digital Humanities in the Nordic Countries 2020. CEUR Workshop Proceedings

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

File

Fear in Akkadian Texts: New Digital Perspectives on Lexical Semantics

Svärd, S., Alstola, T., Jauhiainen, H., Sahala, A. & Linden, K., 2020, (Accepted/In press) The Expression of Emotions in Ancient Egypt and Mesopotamia. Hsu, S-W. & Llop-Raduà, J. (eds.). Leiden: Brill

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Scientific › Peer-reviewed

FST Morphology for the Endangered Skolt Sami Language

Rueter, J. & Hämäläinen, M., 2020, Proceedings of the 1st Joint SLTU and CCURL Workshop (SLTU-CCURL 2020). Beermann, D., Besacier, L., Sakti, S. & Soria, C. (eds.). Paris: European Language Resources Association (ELRA), p. 250-257 8 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

HELFI: a Hebrew-Greek-Finnish Parallel Bible Corpus with Cross-Lingual Morpheme Alignment

Yli-Jyrä, A., Purhonen, J., Liljeqvist, M., Antturi, A., Nieminen, P., Räntilä, K. M. & Luoto, V., 16 Mar 2020, LREC 2020, Eleventh International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA), 8 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

LT@Helsinki at SemEval-2020 Task 12: Multilingual or language-specific BERT?

Pàmies, M., Öhman, E., Kajava, K. & Tiedemann, J., 2020, (Accepted/In press) Proceedings of the International Workshop on Semantic Evaluation (SemEval).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Morfessor EM+Prune: Improved Subword Segmentation with Expectation Maximization and Pruning

Grönroos, S-A., Virpioja, S. & Kurimo, M., 1 May 2020, Proceedings of The 12th Language Resources and Evaluation Conference. Marseille, France: European Language Resources Association (ELRA), p. 3944-3953 10 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access

Morphological Disambiguation of South Sámi with FSTs and Neural Networks

Hämäläinen, M. & Wiechetek, L., 2020, Proceedings of the 1st Joint SLTU and CCURL Workshop (SLTU-CCURL 2020). European Language Resources Association (ELRA), p. 36-40

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

MT for subtitling: User evaluation of post-editing productivity

Koponen, M., Sulubacak, U., Vitikainen, K. & Tiedemann, J., 10 Jun 2020, Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (EAMT 2020). Martins, A., Moniz, H., Fumega, S., Martins, B., Batista, F., Coheur, L., Parra, C., Trancoso, I., Turchi, M., Bisazza, A., Moorkens, J., Guerberof, A., Nurminen, M., Marg, L. & Forcada, M. L. (eds.). Lisbon, Portugal: European Association for Machine Translation, Vol. 1. p. 115-124 10 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

Multimodal Machine Translation through Visuals and Speech

Sulubacak, U., Caglayan, O., Grönroos, S-A., Rouhe, A., Elliott, D., Specia, L. & Tiedemann, J., 9 Mar 2020, (Accepted/In press) In: Machine Translation. 34 p.

Research output: Contribution to journal › Article › Scientific › Peer-reviewed

File

On Editing Dictionaries for Uralic Languages in an Online Environment

Alnajjar, K., Hämäläinen, M. & Rueter, J., 2020, Proceedings of the Sixth International Workshop on Computational Linguistics of Uralic Languages. Stroudsburg, PA: The Association for Computational Linguistics, p. 26–30

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

On Practical Realisation of Autosegmental Representations in Lexical Transducers of Tonal Bantu Languages

Yli-Jyrä, A., 13 Jan 2020, (Submitted) LT4ALL. UNESCO, 4 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

OpusTools and Parallel Corpus Diagnostics

Aulamo, M., Sulubacak, U., Virpioja, S. & Tiedemann, J., 17 May 2020, Proceedings of the 12th Language Resource and Evaluation Conference. Calzolari, N., Béchet, F., Blache, P., Choukri, K., Cieri, C., Declerck, T., Goggi, S., Isahara, H., Maegaard, B., Mariani, J., Mazo, H., Moreno, A., Odijk, J. & Piperidis, S. (eds.). Marseille, France: European Language Resources Association (ELRA), p. 3775 3782 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

Paraphrase Generation and Evaluation on Colloquial-Style Sentences

Sjöblom, E., Creutz, M. & Scherrer, Y., 1 May 2020, Proceedings of The 12th Language Resources and Evaluation Conference. Marseille, France: European Language Resources Association (ELRA), p. 1814-1822 9 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access

Raamatun jakeita uralilaisille kielille: , rinnakkaiskorpus, sekoitettu, Korp [tekstikorpus].

Rueter, J. & Axelson, E., Feb 2020

Research output: Non-textual form › Software › Scientific

Skolt Sami, the makings of a pluricentric language, where does it stand?

Rueter, J. & Hämäläinen, M., 2020, European Pluricentric Languages in Contact and Conflict. Muhr, R., Mas Castells, J. A. & Rueter, J. (eds.). Bern: Peter Lang, 12. (Österreichisches Deutsch – Sprache der Gegenwart; no. 21).

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Scientific › Peer-reviewed

TaPaCo: A Corpus of Sentential Paraphrases for 73 Languages

Scherrer, Y., 1 May 2020, Proceedings of The 12th Language Resources and Evaluation Conference. Marseille, France: European Language Resources Association (ELRA), p. 6868-6873 6 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

The University of Helsinki Submission to the IWSLT2020 Offline Speech Translation Task

Vázquez, R., Aulamo, M., Sulubacak, U. & Tiedemann, J., 24 Apr 2020, (Submitted) Proceedings of the 17th International Conference on Spoken Language Translation (IWSLT). Seattle, WA, USA

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

File

Wrangling with non-standard data

Mäkelä, E., Lagus, K., Lahti, L., Säily, T., Tolonen, M., Hämäläinen, M., Kaislaniemi, S. & Nevalainen, T., 2020, Proceedings of the Digital Humanities in the Nordic Countries 5th Conference: Riga, Latvia, October 21-23, 2020. Reinsone, S., Skadiņa, I., Baklāne, A. & Daugavietis, J. (eds.). Aachen: CEUR-WS.org, p. 81-96 16 p. (CEUR Workshop Proceedings; vol. 2612).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

2019

A Creative Dialog Generator for Fallout 4

Alnajjar, K. & Hämäläinen, M., 2019, Proceedings of the 14th International Conference on the Foundations of Digital Games. New York: ACM, 4 p. 48

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

A Derivational Model of Discontinuous Parsing

Yli-Jyrä, A. & Nederhof, M-J., 2019, (Accepted/In press) In: Information and Computation.

Research output: Contribution to journal › Article › Scientific › Peer-reviewed

Analysing concatenation approaches to document-level NMT in two different domains

Scherrer, Y., Tiedemann, J. & Loáiciga, S., 1 Nov 2019, The Fourth Workshop on Discourse in Machine Translation: Proceedings of the Workshop. Stroudsburg: The Association for Computational Linguistics, p. 51-61 11 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation

Raganato, A., Vázquez, R., Creutz, M. & Tiedemann, J., 1 Aug 2019, The 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Proceedings of the Workshop. Augenstein, I., Gella, S., Ruder, S., Kann, K., Can, B., Welbl, J., Conneau, A., Ren, X. & Rei, M. (eds.). Stroudsburg: The Association for Computational Linguistics, p. 27-32 6 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

Annotation of subtitle paraphrases using a new web tool

Aulamo, M. J., Creutz, M. J. P. & Sjöblom, E. I., 17 May 2019, Digital Humanities in the Nordic Countries: Proceedings of the Digital Humanities in the Nordic Countries 4th Conference. Navarretta, C., Agirrezabal, M. & Maegaard, B. (eds.). Aachen: CEUR-WS.org, p. 33-48 16 p. (CEUR Workshop Proceedings; no. 2364).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

An Open Online Dictionary for Endangered Uralic Languages

Hämäläinen, M. & Rueter, J., 2019, Electronic lexicography in the 21st century: Proceedings of the eLex 2019 conference. Kosem, I., Zingano Kuhn, T., Correia, M., Ferreira, J. P., Jansen, M., Pereira, I., Kallas, J., Jakubíček, M., Krek, S. & Tiberius, C. (eds.). Brno: Lexical Computing CZ s.r.o., p. 819-830 12 p. (Electronic lexicography in the 21st century).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

ArchiMob: Ein multidialektales Korpus schweizerdeutscher Spontansprache

Scherrer, Y., Samardžić, T. & Glaser, E., 1 Nov 2019, In: Linguistik Online. 98, 5, p. 425-454 30 p.

Research output: Contribution to journal › Article › Scientific › Peer-reviewed

Open access
File

A Report on the Third VarDial Evaluation Campaign

Zampieri, M., Malmasi, S., Scherrer, Y., Samardžic, T., Tyers, F., Silfverberg, M. P., Klyueva, N., Pan, T-L., Huang, C-R., Ionescu, R. T., Butnaru, A. & Jauhiainen, T. S., 2019, Proceedings of the . Zampieri, M., Nakov, P., Malmasi, S., Ljubešić, N., Tiedemann, J. & Ali, A. (eds.). Stroudsburg: The Association for Computational Linguistics, p. 1-16 16 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific

Open access
File

Aššur and His Friends: A Statistical Analysis of Neo-Assyrian Texts

Alstola, T., Zaia, S., Sahala, A., Jauhiainen, H., Svärd, S. & Linden, K., 2019, In: Journal of Cuneiform Studies. 71, p. 159-180 22 p.

Research output: Contribution to journal › Article › Scientific › Peer-reviewed

Open access
File

A Template Based Approach for Training NMT for Low-Resource Uralic Languages - A Pilot with Finnish

Hämäläinen, M. & Alnajjar, K., Dec 2019, ACAI 2019: Proceedings of the 2019 2nd International Conference on Algorithms, Computing and Artificial Intelligence. ACM, p. 520-525

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

Automatic Language Identification in Texts: A Survey

Jauhiainen, T., Lui, M., Zampieri, M., Baldwin, T. & Lindén, K., 25 Aug 2019, In: Journal of Artificial Intelligence Research. 65, p. 675-782 108 p.

Research output: Contribution to journal › Article › Scientific › Peer-reviewed

Open access
File

Constraint Grammar As a Hand-Crafted Transformer

Yli-Jyrä, A., 2019.

Research output: Contribution to conference › Paper

Constraint Grammar is a hand-crafted Transformer

Yli-Jyrä, A., 3 Dec 2019, Proceedings of the NoDaLiDa 2019 Workshop on Constraint Grammar - Methods, Tools and Applications, 30 September 2019, Turku, Finland. Bick, E. & Trosterud, T. (eds.). Linköping: Linköping University Electronic Press, p. 45-49 5 p. 9. (NEALT Proceedings Series; no. 33)(Linköping Electronic Conference Proceedings; no. 168).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

Co-Operation as an Asymmetric Form of Human-Computer Creativity. Case: Peace Machine

Hämäläinen, M. & Honkela, T., 2019, Proceedings of the First Workshop on NLP for Conversational AI. Stroudsburg: The Association for Computational Linguistics, p. 42–50 9 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

Creative Contextual Dialog Adaptation in an Open World RPG

Hämäläinen, M. & Alnajjar, K., 2019, Proceedings of the 14th International Conference on the Foundations of Digital Games. New York: ACM, 7 p. 73

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

Dialect Text Normalization to Normative Standard Finnish

Partanen, N., Hämäläinen, M. & Alnajjar, K., 2019, The Fifth Workshop on Noisy User-generated Text (W-NUT 2019): Proceedings of the Workshop. Xu, W., Ritter, A., Baldwin, T. & Rahimi, A. (eds.). Stroudsburg: The Association for Computational Linguistics, p. 141–146 6 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

Digitising Swiss German: how to process and study a polycentric spoken language

Scherrer, Y., Samardžić, T. & Glaser, E., 29 Nov 2019, In: Language Resources and Evaluation. 53, 4, p. 735-769 35 p.

Research output: Contribution to journal › Article › Scientific › Peer-reviewed

Open access
File

Discriminating between Mandarin Chinese and Swiss-German varieties using adaptive language models

Jauhiainen, T. S., Jauhiainen, H. A. & Linden, B. K. J., 30 Apr 2019, Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2019). Stroudsburg: The Association for Computational Linguistics, p. 178-187 10 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

Emerging Paradigm of Bibliographic Data Science

Vaara, V., Ijaz, A., Tiihonen, I. L. I., Kanner, A., Säily, T. & Lahti, L., 2019.

Research output: Contribution to conference › Abstract

Open access
File

Finding Sami Cognates with a Character-Based NMT Approach

Hämäläinen, M. & Rueter, J., 2019, Proceedings of the 3rd Workshop on Computational Methods in the Study of Endangered Languages: (Volume 1) Papers. Arppe, A., Good, J., Hulden, M., Lachler, J., Palmer, A., Schwartz, L. & Silfverberg, M. (eds.). Stroudsburg: The Association for Computational Linguistics, p. 39-45 7 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

Forgotten Islands of Regularity in Phonology

Yli-Jyrä, A. M., 2019, (Accepted/In press) Festschrift.... 18 p.

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Scientific › Peer-reviewed

From the Paft to the Fiiture: a Fully Automatic NMT and Word Embeddings Method for OCR Post-Correction

Hämäläinen, M. & Hengchen, S., 2019, Proceedings of Recent Advances in Natural Language Processing. Angelova, G., Mitkov, R., Nikolova, I. & Temnikova, I. (eds.). Shoumen: INCOMA, p. 432-437 6 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File

Generating Modern Poetry Automatically in Finnish

Hämäläinen, M. & Alnajjar, K., 2019, 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing: Proceedings of the Conference. Inui, K., Jiang, J., Ng, V. & Wan, X. (eds.). Stroudsburg: The Association for Computational Linguistics, p. 6001–6006 6 p.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › Peer-reviewed

Open access
File