We give an overview of the highly multilingual news analysis system Europe Media Monitor (EMM), which gathers an average of 175,000 online news articles per day in tens of languages, categorises the news items and extracts named entities and various other information from them. We explain how users benefit from media monitoring and why it is so important to monitor the news in many different languages. We also describe the challenge of developing text mining tools for tens of languages and in particular that of dealing with highly inflected languages, such as those of the Balto-Slavonic and Finno-Ugric language families.
STEINBERGER Ralf
2014-08-19
Springer
JRC82905
0302-9743
http://link.springer.com/chapter/10.1007/978-3-642-41057-4_1
http://link.springer.com/book/10.1007/978-3-642-41057-4
https://publications.jrc.ec.europa.eu/repository/handle/JRC82905
10.1007/978-3-642-41057-4_1
This is a public document. You can share this publication.