Abstract
Named entity recognition (NER) is a well-researched task in the field of NLP, which typically requires large annotated corpora for training usable models. This is a problem for languages which lack large annotated corpora, such as Finnish. We propose an approach to create a named entity recognizer with no annotated or parallel documents, by leveraging strong NER models that exist for English. We automatically gather a large amount of chronologically matched data in two languages, then project named entity annotations from the English documents onto the Finnish ones, by resolving the matches with limited linguistic rules. We use this “artificially” annotated data to train a BiLSTM-CRF model. Our results show that this method can produce annotated instances with high precision, and the resulting model achieves state-of-the-art performance.
Original language | English |
---|---|
Title of host publication | 22nd Nordic Conference on Computational Linguistics (NoDaLiDa) : Proceedings of the Conference |
Editors | Mareike Hartmann, Barbara Plank |
Number of pages | 10 |
Place of Publication | Linköping |
Publisher | Linköping University Electronic Press |
Publication date | Oct 2019 |
Pages | 232-241 |
ISBN (Electronic) | 978-91-7929-995-8 |
Publication status | Published - Oct 2019 |
MoE publication type | A4 Article in conference proceedings |
Event | Nordic Conference on Computational Linguistics, Turku, Finland. Duration: 30 Sept 2019 → 2 Oct 2019. Conference number: 22 |
Publication series
Name | Linköping Electronic Conference Proceedings |
---|---|
Publisher | Linköping University Electronic Press |
Number | 67 |
ISSN (Print) | 1650-3686 |
ISSN (Electronic) | 1650-3740 |

Name | NEALT Proceedings Series |
---|---|
Publisher | Linköping University Electronic Press |
Number | 42 |
Fields of Science
- 113 Computer and information sciences
- 6121 Languages
Projects
- LLL: Language Learning Lab
  Yangarber, R. (Project manager), Katinskaia, A. (Participant), Hou, J. (Participant), Furlan, G. (Participant) & Kylliäinen, I. P. (Participant)
  Project: Research project