Adapting the PULS Event Extraction Framework to Analyze Russian Text

Research output: Chapter in book/report/conference proceeding, Conference contribution, Scientific, Peer-reviewed

Abstract

This paper describes a plug-in component to extend the PULS
information extraction framework to analyze Russian-language
text. PULS is a comprehensive framework for information extraction
(IE) that is used to analyze English-language news in several
scenarios; it is primarily monolingual.
Although monolinguality is recognized as a serious limitation,
building an IE system for a new language from the bottom up is very
labor-intensive. Thus, the objective of the present work is to
explore whether the base framework can be extended to cover additional
languages with limited effort, and to leverage the pre-existing PULS
modules as far as possible, in order to accelerate the development
process.
The component for Russian analysis is described and its performance is
evaluated on two news-analysis scenarios: epidemic surveillance and
cross-border security. The approach described in the paper can be
generalized to a range of heavily inflected languages.
Original language: English
Title of host publication: The 4th Biennial International Workshop on Balto-Slavic Natural Language Processing : ACL 2013
Publication date: 2013
Pages: 100-109
ISBN (print): 978-1-937284-59-6
Status: Published - 2013
MoE publication type: A4 Article in a conference publication
Event: The 4th Biennial International Workshop on Balto-Slavic Natural Language Processing - Sofia, Bulgaria
Duration: 8 Aug 2013 - 9 Aug 2013

Fields of science

  • 113 Computer and information sciences
