Named entity recognition in public procurement notices (Reconhecimento de entidades nomeadas em editais de licitação)

Publisher

Universidade Federal de Goiás

Abstract

This work explores the use of large language models (LLMs) for information extraction in public procurement notices, focusing on the Named Entity Recognition (NER) task. Given the diverse and unstandardized nature of these documents, the study proposes a methodology that integrates semantic selection techniques with Zero-Shot and Few-Shot scenarios, aiming to optimize the annotation and entity extraction process, reduce manual intervention, and improve accuracy. The first step involved building an annotated corpus containing named entities from procurement notices. Subsequently, the BERTimbau, BERTikal, and mDeBERTa models were trained in a supervised manner using this annotated dataset. Experiments showed that BERTimbau achieved the best overall performance, with an F1-score above 0.80. In the Zero-Shot and Few-Shot scenarios, various prompt templates and example selection strategies were tested. Models such as GPT-4 and LLaMA achieved performance comparable to supervised models when aided by semantically relevant examples, despite modest results in the absence of examples. The results indicate that combining enriched prompts with examples and the pre-selection of relevant sentences during the annotation phase contributes to greater accuracy and efficiency in the NER process for procurement notices. The proposed methodology can be applied to information extraction, with potential impacts on transparency and auditing in public procurement.

Citation

SOUZA FILHO, R. P. Reconhecimento de entidades nomeadas em editais de licitação. 2024. 63 f. Dissertação (Mestrado em Ciência da Computação) – Instituto de Informática, Universidade Federal de Goiás, Goiânia, 2024.