Pattern Learning for Event Extraction using Monolingual Statistical Machine Translation
Event extraction systems typically rely on language- and domain-specific knowledge bases, including patterns used to identify specific facts in text; acquiring these patterns is considered one of the most challenging issues.
In this work, we propose a language-independent and weakly-supervised algorithm to automatically discover linear patterns from texts. Our approach is based on a phrase-based statistical machine translation system trained on monolingual data.
We also propose a bootstrapping version of the algorithm. Our method was tested on patterns with different domain-specific semantic roles in three languages: English, Spanish and Russian. Evaluation of the extracted patterns, and of the output of an event extraction system, shows the feasibility of our approach and its capability of working with texts in various languages.
TURCHI Marco;
ZAVARELLA Vanni;
TANEV Hristo;
2012-01-26
Incoma Ltd.
JRC65777
1313-8502