Text Categorization Using Bibliographic Records - Beyond Document Content

This paper studies the use of different sources of information for a text classification task. The growing number of digital libraries calls for a review of the data available in those databases. Experiments were carried out in the domain of High Energy Physics, applying different base classifiers within a multi-label classifier to several of these possible sources. Results show that using metadata is almost as good as using the full text of the papers. Keywords: text categorization, machine learning, digital libraries.
2006-11-24
Sociedad Española para el Procesamiento del Lenguaje Natural
JRC31101
https://publications.jrc.ec.europa.eu/repository/handle/JRC31101
Items published in the JRC Publications Repository are protected by copyright, with all rights reserved, unless otherwise indicated. Additional information: https://ec.europa.eu/info/legal-notice_en#copyright-notice