Abstract
This paper describes a plug-in component to extend the PULS
information extraction framework to analyze Russian-language
text. PULS is a comprehensive framework for information extraction
(IE) that is used to analyze English-language news in several
scenarios and is primarily monolingual.
Although monolinguality is recognized as a serious limitation,
building an IE system for a new language from the bottom up is very
labor-intensive. Thus, the objective of the present work is to
explore whether the base framework can be extended to cover additional
languages with limited effort, and to leverage the pre-existing PULS
modules as far as possible, in order to accelerate the development
process.
The component for Russian analysis is described and its performance is
evaluated on two news-analysis scenarios: epidemic surveillance and
cross-border security. The approach described in the paper can be
generalized to a range of heavily-inflected languages.
Original language | English |
---|---|
Title of host publication | The 4th Biennial International Workshop on Balto-Slavic Natural Language Processing : ACL 2013 |
Publication date | 2013 |
Pages | 100-109 |
ISBN (Print) | 978-1-937284-59-6 |
Publication status | Published - 2013 |
MoE publication type | A4 Article in conference proceedings |
Event | The 4th Biennial International Workshop on Balto-Slavic Natural Language Processing - Sofia, Bulgaria. Duration: 8 Aug 2013 → 9 Aug 2013 |
Fields of Science
- 113 Computer and information sciences
- PULS
  Yangarber, R. (Project manager), Du, M. (Participant), Pivovarova, L. (Participant), Pierce, M. (Participant), von Etter, P. (Participant) & Huttunen, S. (Participant)
  01/12/2007 → …
  Project: Research project
- LLL: Language Learning Lab
  Yangarber, R. (Project manager), Katinskaia, A. (Participant), Hou, J. (Participant), Furlan, G. (Participant) & Kylliäinen, I. P. (Participant)
  Project: Research project