Aplicação de CNN e LLM na Localização de Defeitos de Software
Date
2024-10-16
Authors
Basílio Neto, Altino Dantas
Publisher
Universidade Federal de Goiás
Abstract
The growing number and complexity of computational systems have increased the
occurrence of software defects. The industry invests significant amounts in code
debugging, and a considerable portion of the cost is associated with the task of locating
the element responsible for the defect. Automated techniques for fault localization have
been widely explored, with recent advances driven by the use of deep learning models
that combine different types of information about defective source code. However, the
accuracy of these techniques still has room for improvement, suggesting open challenges
in the field. This work aims to formalize and investigate the most impactful aspects of
fault localization techniques, proposing a framework for characterizing approaches to
the problem and two solution methodologies: a) based on convolutional neural networks
(CNNs) and b) based on large language models (LLMs). Experiments on public Java and
Python datasets showed that the CNN-based approach is comparable to traditional methods
but inferior to other methods in the literature. The LLM-based approach, on the other
hand, greatly outperformed heuristics such as Ochiai and Tarantula and proved competitive
with more recent work in the literature. An experiment in a scenario
free from the data leakage problem showed that LLM-based approaches can be improved
by combining them with the Ochiai heuristic.
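For readers unfamiliar with the Ochiai and Tarantula heuristics named in the abstract, the sketch below shows their standard spectrum-based formulations as commonly given in the fault localization literature. It is not taken from the thesis: the function names, the `rank` helper, and the toy coverage counts are illustrative assumptions. Each program element is described by four counts: failing tests that execute it (`ef`), passing tests that execute it (`ep`), failing tests that do not execute it (`nf`), and passing tests that do not execute it (`np_`).

```python
import math


def ochiai(ef, ep, nf, np_):
    """Ochiai suspiciousness: ef / sqrt((ef + nf) * (ef + ep))."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0


def tarantula(ef, ep, nf, np_):
    """Tarantula suspiciousness: failing ratio over (failing ratio + passing ratio)."""
    fail_ratio = ef / (ef + nf) if (ef + nf) else 0.0
    pass_ratio = ep / (ep + np_) if (ep + np_) else 0.0
    total = fail_ratio + pass_ratio
    return fail_ratio / total if total else 0.0


def rank(spectrum, formula):
    """Order element names from most to least suspicious (hypothetical helper)."""
    scores = {name: formula(*counts) for name, counts in spectrum.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy spectrum (invented for illustration): 2 failing and 3 passing tests.
# Each tuple is (ef, ep, nf, np_).
spectrum = {
    "line_10": (2, 0, 0, 3),  # executed only by failing tests
    "line_11": (1, 2, 1, 1),  # executed by a mix of tests
    "line_12": (0, 3, 2, 0),  # executed only by passing tests
}

ochiai_ranking = rank(spectrum, ochiai)
tarantula_ranking = rank(spectrum, tarantula)
```

On this toy input both heuristics place `line_10` first, since only failing tests execute it. How the thesis combines the LLM output with Ochiai scores is not described in the abstract, so it is not sketched here.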
Citation
Basílio Neto, Altino Dantas. Aplicação de CNN e LLM na Localização de Defeitos de Software. Goiânia, 2024. 178 f. Tese (Doutorado em Ciência da Computação) - Instituto de Informática, Universidade Federal de Goiás, Goiânia, 2024.