JRC Publications Repository

Named Entity Recognition on Turkish Tweets

Various recent studies show that the performance of named entity recognition (NER) systems developed for well-formed text types drops significantly when these systems are applied to tweets. The only existing study for Turkish, a highly inflected agglutinative language, reports a drop in F-measure from 91% to 19% when a system is ported from news articles to tweets. In this study, we present a new named entity-annotated tweet corpus and a detailed analysis of the various tweet-specific linguistic phenomena. We perform comparative NER experiments with a rule-based multilingual NER system adapted to Turkish on three corpora: a news corpus, our new tweet corpus, and another tweet corpus. Based on the analysis and the experimental results, we suggest system features required to improve NER results for social media such as Twitter.
2014-09-24
Association for Computational Linguistics (ACL)
JRC84941
978-2-9517408-8-4
http://www.lrec-conf.org/proceedings/lrec2014/pdf/380_Paper.pdf
https://publications.jrc.ec.europa.eu/repository/handle/JRC84941
Items published in the JRC Publications Repository are protected by copyright, with all rights reserved, unless otherwise indicated. Additional information: https://ec.europa.eu/info/legal-notice_en#copyright-notice