Abstract
This paper describes a plug-in component to extend the PULS
information extraction framework to analyze Russian-language
text. PULS is a comprehensive framework for information extraction
(IE) that is used for analysis of news in several scenarios from
English-language text and is primarily monolingual.
Although monolinguality is recognized as a serious limitation,
building an IE system for a new language from the bottom up is very
labor-intensive. Thus, the objective of the present work is to
explore whether the base framework can be extended to cover additional
languages with limited effort, and to leverage the pre-existing PULS
modules as far as possible, in order to accelerate the development
process.
The component for Russian analysis is described and its performance is
evaluated on two news-analysis scenarios: epidemic surveillance and
cross-border security. The approach described in the paper can be
generalized to a range of heavily inflected languages.
Original language | English |
---|---|
Title of host publication | The 4th Biennial International Workshop on Balto-Slavic Natural Language Processing : ACL 2013 |
Publication date | 2013 |
Pages | 100-109 |
ISBN (print) | 978-1-937284-59-6 |
Status | Published - 2013 |
MoE publication type | A4 Article in conference proceedings |
Event | The 4th Biennial International Workshop on Balto-Slavic Natural Language Processing - Sofia, Bulgaria. Duration: 8 Aug 2013 → 9 Aug 2013 |
Fields of science
- 113 Computer and information sciences
-
PULS
Yangarber, R. (Project manager), Du, M. (Participant), Pivovarova, L. (Participant), Pierce, M. (Participant), von Etter, P. (Participant) & Huttunen, S. (Participant)
01/12/2007 → …
Project: Research project
-
LLL: Language Learning Lab
Yangarber, R. (Project manager), Katinskaia, A. (Participant), Hou, J. (Participant), Furlan, G. (Participant) & Kylliäinen, I. P. (Participant)
Project: Research project