Avaliação de Grandes Modelos de Linguagem para Classificação de Documentos Jurídicos em Português (Evaluation of Large Language Models for the Classification of Legal Documents in Portuguese)
Date
2024-11-26
Authors
Santos, W. F.
Publisher
Universidade Federal de Goiás
Abstract
The growing caseload in judicial institutions has overloaded their staff,
reducing the efficiency of the legal system. This scenario, exacerbated by limited human
resources, highlights the need for technological solutions to streamline the processing
and analysis of documents. In light of this reality, this work proposes a pipeline for
automating the classification of these documents, evaluating four methods of representing
legal texts at the pipeline’s input: original text, summaries, centroids, and document
descriptions. The pipeline was developed and tested at the Public Defender’s Office of
the State of Goiás (DPE-GO). Each approach implements a specific strategy to structure
the input texts, aiming to enhance the models’ ability to interpret and classify legal
documents. A new Portuguese dataset was introduced, specifically designed for this
application, and the performance of Large Language Models (LLMs) was evaluated in
classification tasks. The results show that using summaries improves
classification accuracy and yields the highest F1-score, while
reducing the number of tokens the LLMs process without compromising precision. These findings
highlight the impact of textual representations of documents and the potential of LLMs
for the automatic classification of legal documents, as in the case of DPE-GO. The
contributions of this work indicate that the application of LLMs, combined with optimized
textual representations, can significantly increase the productivity and quality of services
provided by judicial institutions, promoting advancements in the overall efficiency of the
legal system.
Citation
SANTOS, W. F. Avaliação de Grandes Modelos de Linguagem para Classificação de Documentos Jurídicos em Português. 2024. 102 p. Dissertação (Mestrado em Ciência da Computação) - Instituto de Informática, Universidade Federal de Goiás, Goiânia, 2024.