Abstract
The Helsinki-NLP team participated in the NADI 2023 shared tasks on Arabic dialect translation with seven submissions. We used statistical machine translation (SMT) and neural machine translation (NMT) methods and explored character- and subword-based data preprocessing. Our submissions placed second in both tracks. In the open track, our best submission is a character-level SMT system with additional Modern Standard Arabic language models. In the closed track, our best BLEU scores were obtained with the leave-as-is baseline, a simple copy of the input, narrowly followed by SMT systems. In both tracks, fine-tuning existing multilingual models such as AraT5 or ByT5 did not outperform SMT.
Original language | English |
---|---|
Title of host publication | Proceedings of the First Arabic Natural Language Processing Conference (ArabicNLP 2023) |
Editors | Hassan Sawaf, Samhaa El-Beltagy, Wajdi Zaghouani, Walid Magdy, Ahmed Abdelali, Nadi Tomeh, Ibrahim Abu Farha, Nizar Habash, Salam Khalifa, Amr Keleg, Hatem Haddad, Imed Zitouni, Khalil Mrini, Rawan Almatham |
Number of pages | 8 |
Place of Publication | Stroudsburg |
Publisher | The Association for Computational Linguistics |
Publication date | 1 Dec 2023 |
Pages | 670-677 |
ISBN (Electronic) | 978-1-959429-27-2 |
Publication status | Published - 1 Dec 2023 |
MoE publication type | A4 Article in conference proceedings |
Event | Arabic Natural Language Processing Conference, Singapore. Duration: 7 Dec 2023 → 7 Dec 2023. https://arabicnlp2023.sigarab.org/ |
Fields of Science
- 113 Computer and information sciences
- 6121 Languages
Projects
- 1 Active
- CorCoDial: Corpus-based computational dialectology: exploiting machine translation techniques to extract, visualize and interpret dialectal patterns
Scherrer, Y. (Project manager), Tiedemann, J. (Project manager), Kuparinen, O. V. (Participant), Mickus, T. (Participant), Miletic Haddad, A. (Participant), Psaltaki, E. (Participant), Roemling, D. (Participant) & Siewert, J. (Participant)
Academy of Finland (Suomen Akatemia), project invoicing
01/09/2021 → 31/08/2025
Project: Research Council of Finland: Academy Project