Title: Exploring Linguistic Features for Web Spam Detection: A Preliminary Study
Authors: SYDOW Marcin; WEISS Dawid
Other Contributors: PISKORSKI Jakub
Citation: Proceedings of the 4th International Workshop on Adversarial Information Retrieval on the Web, p. 1-4
Publisher: ACM
Publication Year: 2008
JRC N°: JRC45828
URI: http://airweb.cse.lehigh.edu/2008; http://airweb.cse.lehigh.edu/2008/submissions/piskorski_2008_linguistic_analysis_spam.pdf
http://publications.jrc.ec.europa.eu/repository/handle/JRC45828
Type: Contributions to Conferences
Abstract: We study the usability of linguistic features in the Web spam classification task. The features were computed on two Web spam corpora, Webspam-Uk2006 and Webspam-Uk2007, and we make them publicly available for other researchers. Preliminary analysis suggests that certain linguistic features may be useful for the spam-detection task when combined with features studied elsewhere.
JRC Institute: Institute for the Protection and Security of the Citizen

Items in repository are protected by copyright, with all rights reserved, unless otherwise indicated.