Cross-lingual Linking of Multi-word Entities and Language-dependent Learning of Multi-word Entity Patterns
In this chapter, we present our contribution to multi-word entity (MWEntity) recognition in a highly multilingual environment. The first part describes completed work on recognising MWEntities in large volumes of text in 22 languages, on identifying monolingual variants of the same entity, and on linking the equivalent groups of variants across all languages. The second part describes our ongoing work on learning MWEntity recognition rules from the already recognised MWEntities. We then show how such rules can improve the recognition of new or unknown MWEntities. The purpose of our effort is to improve on current methods for Named Entity Recognition (NER) in order to turn free text into semi-structured data that can be used for improved search, for linking related news over time and across languages, for trend detection and, more generally, for more advanced analysis of large multilingual text collections.
JACQUET Guillaume;
EHRMANN Maud;
PISKORSKI Jakub;
TANEV Hristo;
STEINBERGER Ralf;
2017-12-19
Language Science Press
JRC105301