Aplicação de CNN e LLM na Localização de Defeitos de Software

Basílio Neto, Altino Dantas

Files

Tese - Altino Dantas Basílio Neto - 2024.pdf (9.49 MB)

Date

2024-10-16

Authors

Basílio Neto, Altino Dantas

Publisher

Universidade Federal de Goiás

Abstract

As computational systems grow in number and complexity, software defects occur more often. The industry invests heavily in debugging, and a considerable share of that cost comes from locating the code element responsible for a defect. Automated fault localization techniques have been widely explored, with recent advances driven by deep learning models that combine different types of information about defective source code. However, the accuracy of these techniques still has room for improvement, which points to open challenges in the field. This work aims to formalize and investigate the most impactful aspects of fault localization techniques. It proposes a framework for characterizing approaches to the problem and two solution methodologies: a) one based on convolutional neural networks (CNNs) and b) one based on large language models (LLMs). Experiments on public datasets in Java and Python showed that the CNN-based approach is comparable to traditional methods but inferior to other methods in the literature. The LLM-based approach, on the other hand, greatly outperformed heuristics such as Ochiai and Tarantula and proved competitive with more recent literature. An experiment in a scenario free from the data leakage problem showed that LLM-based approaches can be improved by combining them with the Ochiai heuristic.
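For readers unfamiliar with the baseline heuristics named in the abstract, the following is a minimal sketch of the standard Ochiai and Tarantula suspiciousness formulas from the spectrum-based fault localization literature. It is not taken from the thesis: the function names, the element labels, and the coverage counts in `spectrum` are hypothetical and chosen only for illustration.

```python
import math

def ochiai(ef, ep, nf, np_):
    """Ochiai suspiciousness of one program element.

    ef: failing tests that execute the element
    ep: passing tests that execute the element
    nf: failing tests that do not execute it
    np_: passing tests that do not execute it (unused by Ochiai)
    """
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def tarantula(ef, ep, nf, np_):
    """Tarantula suspiciousness, using the same four counts."""
    fail_ratio = ef / (ef + nf) if (ef + nf) else 0.0
    pass_ratio = ep / (ep + np_) if (ep + np_) else 0.0
    total = fail_ratio + pass_ratio
    return fail_ratio / total if total else 0.0

# Hypothetical coverage counts (ef, ep, nf, np_) for three elements,
# assuming a test suite with 2 failing and 3 passing tests.
spectrum = {
    "line_10": (2, 0, 0, 3),  # executed only by failing tests
    "line_11": (1, 3, 1, 0),  # executed by a mix of tests
    "line_12": (0, 3, 2, 0),  # executed only by passing tests
}

# Rank elements from most to least suspicious according to Ochiai.
ranked = sorted(spectrum, key=lambda e: ochiai(*spectrum[e]), reverse=True)
print(ranked)
```

Both heuristics rely only on which tests execute each element, which is why they serve as lightweight baselines against learned models.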

Keywords

Localização de defeitos, Redes Neurais Artificiais, Redes Neurais Convolucionais, Modelos de Linguagem de Grande Porte, Fault Localization, Artificial Neural Network, Convolutional Neural Networks, Large Language Model

Citation

Basílio Neto, Altino Dantas. Aplicação de CNN e LLM na Localização de Defeitos de Software. Goiânia, 2024. 178 f. Tese (Doutorado em Ciência da Computação) - Instituto de Informática, Universidade Federal de Goiás, Goiânia, 2024.

URI

http://repositorio.bc.ufg.br/tede/handle/tede/13794

Collections

Doutorado em Ciência da Computação
