Event extraction for Balkan languages
We describe a system for real-time detection of security and crisis events from online news in three Balkan languages: Turkish, Romanian and Bulgarian. The system classifies events according to a fine-grained set of event types and extracts structured information from news reports using a blend of keyword matching and finite-state grammars for entity recognition. We apply a multilingual methodology to develop the system's language resources, based on the adaptation of language-independent grammars and on weakly supervised learning of lexical resources. A detailed performance evaluation shows that the approach is effective for developing real-world semantic processing applications for relatively less-resourced languages.
ZAVARELLA Vanni;
KUCUK Dilek;
TANEV Hristo;
HÜRRIYETOĞLU Ali;
2014-08-22
The Association for Computational Linguistics
JRC88244
978-1-937284-78-7