Title: Creation and use of multilingual named entity variant dictionaries
Publisher: Editions Modulaires Européennes
Publication Year: 2014
JRC N°: JRC91623
ISBN: 978-2-8066-1144-4
ISSN: 0771-6524
URI: https://www.eme-editions.be/en/cahiers-de-linguistique/53286-40-2-2014-9782806611413.html
Type: Articles in periodicals and books
Abstract: The highly multilingual media analysis application Europe Media Monitor (EMM) makes extensive use of name dictionaries, including not only large lists of person, organisation and location names, but also many spelling variants for the same named entity, both within the same language and across languages and scripts. As EMM could not operate without these non-traditional dictionaries, we wish to make a strong case in their favour. In this chapter, we will explain how such vocabulary lists are used within EMM and how they were produced automatically by analysing over 100,000 news articles per day in over twenty languages. A large part of EMM’s vocabulary lists is made publicly available for download as part of JRC-Names.
JRC Directorate: Space, Security and Migration
