Abstract
This paper describes a plug-in component to extend the PULS
information extraction framework to analyze Russian-language
text. PULS is a comprehensive framework for information extraction
(IE) that is used for analysis of news in several scenarios from
English-language text and is primarily monolingual.
Although monolinguality is recognized as a serious limitation,
building an IE system for a new language from the bottom up is very
labor-intensive. Thus, the objective of the present work is to
explore whether the base framework can be extended to cover additional
languages with limited effort, and to leverage the pre-existing PULS
modules as far as possible, in order to accelerate the development
process.
The component for Russian analysis is described and its performance is
evaluated on two news-analysis scenarios: epidemic surveillance and
cross-border security. The approach described in the paper can be
generalized to a range of heavily-inflected languages.
Original language | English |
---|---|
Title | The 4th Biennial International Workshop on Balto-Slavic Natural Language Processing : ACL 2013 |
Publication date | 2013 |
Pages | 100-109 |
ISBN (print) | 978-1-937284-59-6 |
Status | Published - 2013 |
MoE publication type | A4 Article in conference proceedings |
Event | The 4th Biennial International Workshop on Balto-Slavic Natural Language Processing - Sofia, Bulgaria. Duration: 8 Aug 2013 → 9 Aug 2013 |
Fields of Science
- 113 Computer and information sciences
- PULS
  Yangarber, R. (Project manager), Du, M. (Participant), Pivovarova, L. (Participant), Pierce, M. (Participant), von Etter, P. (Participant) & Huttunen, S. (Participant)
  01/12/2007 → …
  Project: Research project
- LLL: Language Learning Lab
  Yangarber, R. (Project manager), Katinskaia, A. (Participant), Hou, J. (Participant), Furlan, G. (Participant) & Kylliäinen, I. P. (Participant)
  Project: Research project