Title: Exploring Linguistic Features for Web Spam Detection: A Preliminary Study
Authors: SYDOW Marcin; WEISS Dawid
Other Contributors: PISKORSKI Jakub
Citation: Proceedings of the 4th International Workshop on Adversarial Information Retrieval on the Web p. 1-4
Publisher: ACM
Publication Year: 2008
JRC N°: JRC45828
URI: http://airweb.cse.lehigh.edu/2008; http://airweb.cse.lehigh.edu/2008/submissions/piskorski_2008_linguistic_analysis_spam.pdf
Type: Articles in periodicals and books
Abstract: We study the usability of linguistic features in the Web spam classification task. The features were computed on two Web spam corpora, Webspam-Uk2006 and Webspam-Uk2007, and we make them publicly available to other researchers. Preliminary analysis seems to indicate that certain linguistic features may be useful for the spam-detection task when combined with features studied elsewhere.
JRC Directorate: Space, Security and Migration