Title: Acquisition and Use of Multilingual Name Dictionaries
Authors: POULIQUEN Bruno; STEINBERGER Ralf; IGNAT Camelia
Citation: The Workshop on Acquisition and Management of Multilingual Lexicons - Proceedings p. 1-10
Publisher: Bulgarian Academy of Sciences
Publication Year: 2007
JRC N°: JRC45494
URI: http://pers-www.wlv.ac.uk/~in8113/amml07/#papers
http://publications.jrc.ec.europa.eu/repository/handle/JRC45494
Type: Contributions to Conferences
Abstract: We present a method and a working system that automatically builds a large multilingual dictionary of person and organisation names through daily news analysis. The system uses this name dictionary, together with a gazetteer of location names and other means, to link related news articles across 19 languages. Prominent features of the system are the simplicity of the approach (required to extend the functionality to so many languages), the fact that monolingual and cross-lingual name variants are automatically merged with the name's base form, and the fact that the system aggregates information about persons independently of the spelling of their name. The system, accessible online at http://press.jrc.it/NewsExplorer/, has so far collected over 630,000 different names from real-life news, with up to 140 variants for the same name, plus their inflections. We place this work in the wider context of other text-related activities carried out at the European Commission's Joint Research Centre (JRC).
JRC Institute: Institute for the Protection and Security of the Citizen