Title: Cross-lingual Similarity Calculation for Plagiarism Detection and More - Tools and Resources
Citation: CLEF 2012. Evaluation Labs and Workshop. Abstracts - Working Notes Papers
Publisher: CLEF
Publication Year: 2012
JRC N°: JRC73867
URI: http://ims-sites.dei.unipd.it/documents/71612/155385/CLEF2012wn-PAN-Steinberger2012.pdf
Type: Articles in periodicals and books
Abstract: A system that recognises cross-lingual plagiarism needs to establish, among other things, whether two pieces of text written in different languages are equivalent to each other. Potthast et al. (2010) give a thorough overview of this challenging task. While the Joint Research Centre (JRC) is not specifically concerned with plagiarism, it has worked for many years on other cross-lingual functionalities that may well be useful for plagiarism detection, namely (a) cross-lingual document similarity calculation, (b) subject domain profiling of documents in many different languages according to the same multilingual subject domain categorisation scheme, and (c) the recognition of name spelling variants for the same entity, both within the same language and across different languages and scripts. The speaker will explain the algorithms behind these software tools and present a number of freely available language resources that can be used to develop software with cross-lingual functionality.
JRC Directorate: Space, Security and Migration
