Automatic document classification: customized classifier selection

Date

2020-11-23

Publisher

Universidade Federal de Goiás

Abstract

The recent growth in digitally stored data has spurred the development of methods for organizing this data and extracting relevant knowledge from it. Automatic document classification (ADC) is one such method. Considered one of the most relevant and challenging tasks in information retrieval because document data is high-dimensional and sparse, ADC uses machine learning techniques to group similar documents into classes. Recent work advocates multiple classifier systems (MCS), which combine a set of classifiers to achieve better accuracy than a single classifier. One of the most promising MCS approaches is dynamic selection (DS), in which the base classifiers are selected at test time for each new query document to be classified. This work proposes a customized selection of the classification method at query (test) time: only the most competent classifier, or the most competent set of classifiers, is selected to predict the label of each query document. The work also explores parallelism to speed up the ADC task. Experiments on standard datasets show competitive and promising results relative to the baselines used. New opportunities for exploring parallelism are presented as future work.
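The idea of dynamic selection described in the abstract can be illustrated with a minimal sketch. It is not the method proposed in the dissertation: it assumes a common local-accuracy scheme in which each base classifier is scored on the validation documents nearest to the query, and the best-scoring one predicts the label. The function names, the two toy rule-based classifiers, and the two-feature documents are all hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def select_and_predict(classifiers, validation, query, k=3):
    """Pick the base classifier with the highest accuracy on the k
    validation documents nearest to the query, then use it to predict."""
    # Region of competence: the k labelled documents closest to the query.
    neighbours = sorted(validation, key=lambda item: euclidean(item[0], query))[:k]

    def local_accuracy(clf):
        return sum(clf(x) == y for x, y in neighbours) / len(neighbours)

    # Ties are broken in favour of the classifier listed first.
    best = max(classifiers, key=local_accuracy)
    return best(query), best

# Hypothetical base classifiers: each one looks at a single term weight.
def clf_a(x):
    return "sports" if x[0] > 0.5 else "politics"

def clf_b(x):
    return "sports" if x[1] > 0.5 else "politics"

# Toy labelled validation set: (feature vector, true label).
validation = [
    ((0.10, 0.90), "sports"), ((0.20, 0.80), "sports"), ((0.15, 0.95), "sports"),
    ((0.90, 0.10), "sports"), ((0.80, 0.20), "sports"), ((0.95, 0.15), "sports"),
    ((0.10, 0.10), "politics"), ((0.20, 0.15), "politics"),
]

label, chosen = select_and_predict([clf_a, clf_b], validation, (0.12, 0.85))
print(label, chosen.__name__)  # prints "sports clf_b"
```

Here clf_a misclassifies every neighbour of the query while clf_b classifies them all correctly, so clf_b is selected for this document. A query near (0.85, 0.12) would select clf_a instead, which is the per-document behaviour that distinguishes dynamic selection from a fixed ensemble.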

Citation

SILVA, P. H. Classificação automática de documentos: seleção customizada do classificador. 2020. 80 f. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Goiás, Goiânia, 2020.