Zero inflation in ENEM essay scores: a comparison between the zero-inflated beta model and the hurdle model
Publisher
Universidade Federal de Goiás
Abstract
The essay score of the National High School Exam (ENEM), bounded on the interval [0, 1000], contains a considerable proportion of zeros, a phenomenon known as zero inflation. This structural feature of the data calls for specialized statistical models that can handle the hybrid nature of the distribution, which consists of a point mass at zero and a continuous component. The primary objective of this work is to identify the factors that affect the distribution of essay scores for students from public and private schools. A secondary, methodological objective is to compare the adequacy and robustness of two modeling strategies: the Zero-Inflated Beta model (BEINF0) and the Hurdle Model. The study uses ENEM microdata made available by the Anísio Teixeira National Institute of Educational Studies and Research (INEP). The analysis covers two distinct populations: one restricted to Goiânia (2023) and a broader one covering the state of Goiás (2021-2023). The models were implemented within the GAMLSS framework in the R statistical software, with the Hurdle Model specified as a binary (logistic) component plus a continuous intensity component modeled with the Box-Cox t distribution. The descriptive analysis indicates significant disparities in performance and socioeconomic profile between students from public and private schools. Both modeling approaches identified objective-test scores and socioeconomic variables as relevant predictors. Diagnostic analysis (worm plots and residual statistics) showed that the Hurdle Model, despite remaining inadequacies, is methodologically more robust and conceptually better aligned with the exam's evaluation structure. Neither model fully captured the shape of the score distribution, but the hurdle approach performed better.
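The "point mass at zero plus a continuous component" structure described in the abstract can be illustrated with a minimal sketch. This is not the thesis code, which was written in R with GAMLSS. The function names, the mean/precision parameterization of the beta density, the made-up scores, and the rescaling of scores by 1000 are all assumptions for illustration. Scores of exactly 1000 would fall on the boundary of the beta support and would need separate treatment.

```python
import math

def beta_logpdf(y, mu, phi):
    """Log-density of a beta distribution on (0, 1).

    Assumed parameterization: mean mu in (0, 1) and precision phi > 0,
    so the usual shape parameters are a = mu * phi and b = (1 - mu) * phi.
    """
    a, b = mu * phi, (1.0 - mu) * phi
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))

def two_part_loglik(scores, p0, mu, phi):
    """Log-likelihood of a mixture: probability p0 at zero, beta elsewhere."""
    total = 0.0
    for y in scores:
        if y == 0:
            total += math.log(p0)                       # point mass at zero
        else:
            total += math.log(1.0 - p0) + beta_logpdf(y, mu, phi)
    return total

# Hypothetical essay scores divided by 1000 (illustrative values only).
scores = [0.0, 0.0, 0.52, 0.60, 0.68, 0.74, 0.80, 0.56, 0.90, 0.64]

# Without covariates, the zero part separates from the continuous part,
# so the maximum-likelihood estimate of p0 is the observed share of zeros.
p0_hat = sum(1 for y in scores if y == 0) / len(scores)
```

Because the likelihood factorizes into a binary part and a continuous part, the two components can be fitted separately, and the continuous density can be swapped for another family on the positive scores, as the hurdle specification in the abstract does with the Box-Cox t distribution.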
Citation
LIMA, João Marcos Ribeiro. Inflação de zeros nas notas da redação do ENEM: comparação entre o modelo beta inflacionado em zero e o modelo de barreira. 2025. 75 f. Trabalho de Conclusão de Curso (Bacharelado em Estatística) – Instituto de Matemática e Estatística, Universidade Federal de Goiás, Goiânia, 2025.