Using Parallel Corpora for Multilingual (Multi-Document) Summarisation Evaluation
We present a method for evaluating multilingual multi-document
summarisation that saves precious annotation time and makes
evaluation results directly comparable across languages. The approach
is based on manually selecting the most important sentences in a
cluster of documents from a sentence-aligned parallel corpus and
projecting that sentence selection to various target languages. We also
present two ways of exploiting inter-annotator agreement levels, apply
them both to a baseline sentence extraction summariser in seven
languages, and discuss how the results differ between the two
evaluation versions and between languages. The same method can in
principle be used to evaluate single-document summarisers or information
extraction tools.
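The projection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the function names `project_selection` and `selection_overlap`, and the simple overlap score are all assumptions made for illustration, and the two agreement-based evaluation variants mentioned above are not reproduced here. The sketch assumes only that a sentence-aligned corpus lets one sentence index refer to the same content in every language.

```python
from typing import Dict, List, Set

# Hypothetical toy cluster: every language holds the same sentences in the
# same order, so a sentence index identifies the same content everywhere.
parallel_cluster: Dict[str, List[str]] = {
    "en": [
        "Floods hit the region.",
        "Thousands were evacuated.",
        "The weather was mild last year.",
    ],
    "fr": [
        "Des inondations ont frappé la région.",
        "Des milliers de personnes ont été évacuées.",
        "Le temps était doux l'an dernier.",
    ],
}


def project_selection(
    selected: Set[int], cluster: Dict[str, List[str]], target_lang: str
) -> List[str]:
    """Return the target-language sentences at the manually selected indices."""
    sentences = cluster[target_lang]
    return [sentences[i] for i in sorted(selected)]


def selection_overlap(system: Set[int], gold: Set[int]) -> float:
    """Fraction of system-selected sentences that annotators also selected.

    This score is an illustrative assumption, not a measure taken from the paper.
    """
    if not system:
        return 0.0
    return len(system & gold) / len(system)


# Annotators pick sentences once (here by index); the same indices then
# serve as the reference selection for every aligned language.
gold_selection = {0, 1}
french_reference = project_selection(gold_selection, parallel_cluster, "fr")
score = selection_overlap({0, 2}, gold_selection)
```

Because the selection is stored as language-independent indices, the same annotation can be reused to score an extractive summariser in any language of the corpus.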
TURCHI Marco;
STEINBERGER Josef;
KABADJOV Mijail;
STEINBERGER Ralf;
2010-10-20
Springer-Verlag
JRC59027
http://www.springerlink.com/content/l145412357783t24/
https://publications.jrc.ec.europa.eu/repository/handle/JRC59027
10.1007/978-3-642-15998-5_7