Adapting the PULS Event Extraction Framework to Analyze Russian Text

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

This paper describes a plug-in component to extend the PULS
information extraction framework to analyze Russian-language
text. PULS is a comprehensive framework for information extraction
(IE) that is used to analyze English-language news in several
scenarios; it is primarily monolingual.
Although monolinguality is recognized as a serious limitation,
building an IE system for a new language from the bottom up is very
labor-intensive. Thus, the objective of the present work is to
explore whether the base framework can be extended to cover additional
languages with limited effort, and to leverage the pre-existing PULS
modules as far as possible, in order to accelerate the development
process.
The component for Russian analysis is described and its performance is
evaluated on two news-analysis scenarios: epidemic surveillance and
cross-border security. The approach described in the paper can be
generalized to a range of heavily-inflected languages.

Original language: English
Title of host publication: The 4th Biennial International Workshop on Balto-Slavic Natural Language Processing : ACL 2013
Publication date: 2013
Pages: 100-109
ISBN (Print): 978-1-937284-59-6
Publication status: Published - 2013
MoE publication type: A4 Article in conference proceedings
Event: The 4th Biennial International Workshop on Balto-Slavic Natural Language Processing - Sofia, Bulgaria
Duration: 8 Aug 2013 – 9 Aug 2013

Fields of Science

  • 113 Computer and information sciences
