2023-06-272023-06-272023-05-26OLIVEIRA, F. S. Avaliação da qualidade da sintetização de fala gerada por modelos de redes neurais profundas. 2023. 129 f. Tese (Doutorado em Ciência da Computação) - Universidade Federal de Goiás, Goiânia, 2023.http://repositorio.bc.ufg.br/tede/handle/tede/12916With the emergence of intelligent personal assistants, the need for high-quality conversational interfaces has increased. While text-based chatbots are popular, the development of voice interfaces is equally important. However, the primary method for evaluating voice-based conversational models is mainly done through Mean Opinion Score (MOS), which relies on a manual and subjective process. In this context, this thesis aims to contribute with a new methodology for evaluating voice-based conversational interfaces, with a case study specifically conducted in Brazilian Portuguese. The proposed methodology includes an architecture for predicting the quality of synthesized speech in Brazilian Portuguese, correlated with MOS. To evaluate the proposed methodology, this work included training Text-to-Speech models to create the dataset called BRSpeechMOS. Details about the creation of this dataset are presented, along with a qualitative and quantitative analysis of it. A series of experiments were conducted to train various architectures using the BRSpeechMOS dataset. The architectures used are based on supervised and self-supervised learning. The results obtained confirm the hypothesis raised that pre-trained models on voice processing tasks such as speaker verification and automatic speech recognition produce suitable acoustic representations for the task of predicting speech quality, contributing to the advancement of the state of the art in the development of evaluation methodologies for conversational models.Attribution-NonCommercial-NoDerivatives 4.0 InternationalAvaliação da falaAvaliação da fala sintetizadaPredição de MOSRedes neurais profundasPredição da qualidadeSpeech assessmentSynthesized speech assessmentMOS predictionDeep neural networksQuality predictionCIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAOAvaliação da qualidade da sintetização de fala gerada por modelos de redes neurais profundasEvaluation of speech synthesis quality generated by deep neural network modelsTese