An official website of the European Union How do you know?      
European Commission logo
JRC Publications Repository Menu

Multilingual Statistical News Summarization

cover
In this chapter we present a generic approach for summarizing clusters of multilingual news articles such as the ones produced by the Europe Media Monitor (EMM) system. Our approach uses robust statistical techniques as well as multilingual tools for named entity recognition and disambiguation to produce entity-centered summaries. We run experiments with the TAC 2008 and 2009 data sets (English corpora for summarization research), and we obtained very promising results; at TAC 2009 our runs attained top rank for linguistic quality and second best for overall responsiveness. We also run a small-scale evaluation on languages other than English, demonstrating thereby the multilinguality of our approach, but also providing interesting evidence that contradicts the pervasive assumption “if it works for English, it works for any language”. Finally, we present an online system currently under development which will eventually incorporate all the elements of the summarization approach discussed hereby and we show sample output summaries in various languages.
2013-06-06
Springer Verlag
JRC65730
978-3-642-28569-1,    978-3-642-28568-4 (print),   
2192-032X (print),    2192-0338,   
http://dx.doi.org/10.1007/978-3-642-28569-1_11,    https://publications.jrc.ec.europa.eu/repository/handle/JRC65730,   
10.1007/978-3-642-28569-1_11,   
Language Citation
NameCountryCityType
Datasets
IDTitlePublic URL
Dataset collections
IDAcronymTitlePublic URL
Scripts / source codes
DescriptionPublic URL
Additional supporting files
File nameDescriptionFile type 
Show metadata record  Copy citation url to clipboard  Download BibTeX
Items published in the JRC Publications Repository are protected by copyright, with all rights reserved, unless otherwise indicated. Additional information: https://ec.europa.eu/info/legal-notice_en#copyright-notice