2025-07-01
2025-07-01
2025-04-24
INUZUKA, M. A. Decomposição de tarefas para problemas de linguagem natural: segmentação de hashtags e anotação de texto argumentativo. 2025. 293 f. Tese (Doutorado em Ciência da Computação) - Instituto de Informática, Universidade Federal de Goiás, Goiânia, 2025.
https://repositorio.bc.ufg.br/tede/handle/tede/14460
Corpus annotation is essential for training Natural Language Processing (NLP) models, yet it faces challenges such as high cognitive complexity, annotator inconsistency, and elevated costs. This thesis proposes task decomposition as a methodological strategy to modularize complex NLP processes, promoting greater conceptual clarity, scalability, and reproducibility. Initially focused on Argument Mapping, the research narrowed its scope after the original task proved infeasible, concentrating instead on identifying reusable patterns applicable to the annotation and automation stages. The work produced guidelines, a hierarchical decomposition algorithm, and artifacts such as annotated datasets and the Argmap platform, which supports collaborative annotation with quality control. The approach was validated through three empirical case studies: hashtag segmentation, keyphrase curation, and annotation of argumentative structures. Results demonstrate that decomposition improves consistency among agents (human or automatic), guideline clarity, and automation feasibility. The thesis also introduces the Recruiter–Selector architectural pattern, which structures a task into two independent modules (candidate generation and final selection) and applies to both annotation workflows and algorithms based on Large Language Models (LLMs).
It concludes that decomposition driven by reusable patterns enhances efficiency and reliability in corpus construction and in the development of robust NLP systems, contributing to the systematization of annotation processes and their integration with automatic solutions.
Open Access
http://creativecommons.org/licenses/by-nc-nd/4.0/
Corpus annotation; Natural Language Processing; Data quality; Reusable patterns; LLMs; Task decomposition
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
Decomposição de tarefas para problemas de linguagem natural: segmentação de hashtags e anotação de texto argumentativo
Task decomposition to natural language problems: hashtag segmentation and annotation argumentative
Thesis