Title: Text Categorization Using Bibliographic Records - Beyond Document Content
Authors: MONTEJO-RAEZ Arturo; URENA-LOPEZ L. Alfonso; STEINBERGER Ralf
Citation: Procesamiento del Lenguaje Natural no. 35 p. 119-126
Publisher: Sociedad Espanola para el Procesamiento del Lenguaje Natural
Publication Year: 2005
JRC N°: JRC31101
URI: http://publications.jrc.ec.europa.eu/repository/handle/JRC31101
Type: Articles in periodicals and books
Abstract: This paper studies the use of different sources of information for performing a text classification task. The growing number of digital libraries calls for a review of the data available from those databases. Experiments applying different base classifiers for a multi-label classifier in the domain of High Energy Physics have been carried out on several of these possible sources. Results show that using metadata is almost as good as using the full-text version of papers.
Keywords: text categorization, machine learning, digital libraries.
JRC Directorate: Space, Security and Migration