kNN Method Variations and Their Applications in Automatic Text Classification
Date
2010-10-10
Publisher
Universidade Federal de Goiás
Abstract
Most research on Automatic Text Categorization (ATC) seeks to improve the performance
(effectiveness or efficiency) of the classifier responsible for automatically classifying
a document d that has not yet been labeled. The k nearest neighbors (kNN) method is one of
the simplest and most effective automatic classification methods proposed so far. In this
work we propose two kNN variations, Inverse kNN (kINN) and Symmetric kNN (kSNN), with the
aim of improving the effectiveness of ATC. The kNN, kINN and kSNN methods were applied to
the Reuters, 20NG and Ohsumed collections. The results showed that kINN and kSNN were more
effective than kNN on the Reuters and Ohsumed collections and as effective as kNN on the
20NG collection. In addition, the performance of kNN was more stable than that of kINN and
kSNN as the value of k changed. A parallel study generated new document features from the
similarity matrices produced by the selection criteria that gave the best results for kNN,
kINN and kSNN. SVM (considered a state-of-the-art method) was applied to the Reuters, 20NG
and Ohsumed collections before and after this feature generation, and the results showed
statistically significant gains over the original collections.
Keywords
Text Classification, Machine Learning, kNN Method, Feature Selection, Feature Construction, Term Generation
Citation
SANTOS, Fernando Chagas. kNN Method Variations and its applications in Text
Classification. 2010. 96 f. Dissertation (Master's in Exact and Earth Sciences - Computer Science) - Universidade Federal de Goiás, Goiânia, 2010.