Doutorado em Ciência da Computação
Browsing Doutorado em Ciência da Computação by Title
Now showing 1 - 20 of 35
Item: Abordagem de seleção de características baseada em AUC com estimativa de probabilidade combinada a técnica de suavização de La Place (Universidade Federal de Goiás, 2023-09-28)
Ribeiro, Guilherme Alberto Sousa; Costa, Nattane Luíza da; http://lattes.cnpq.br/9968129748669015; Barbosa, Rommel Melgaço; http://lattes.cnpq.br/6228227125338610; Barbosa, Rommel Melgaço; Lima, Marcio Dias de; Oliveira, Alexandre César Muniz de; Gonçalves, Christiane; Rodrigues, Diego de Castro

The high dimensionality of many datasets has led to the need for dimensionality reduction algorithms that increase performance, reduce computational effort and simplify data processing in applications focused on machine learning or pattern recognition. Given the need for and importance of reduced data, this thesis investigates feature selection methods, focusing on methods that use the AUC (Area Under the ROC Curve). Trends in the use of feature selection methods in general, and of methods using the AUC as an estimator applied to microarray data, were evaluated. A new feature selection algorithm was then developed: the AUC-based feature selection method with probability estimation and Laplace smoothing (AUC-EPS). The proposed method calculates the AUC considering all possible values of each feature, combining probability estimation with Laplace smoothing. Experiments were conducted to compare the proposed technique with the FAST (Feature Assessment by Sliding Thresholds) and ARCO (AUC and Rank Correlation coefficient Optimization) algorithms. Eight datasets related to gene expression in microarrays were used, all of them in the cross-validation experiment and four in the bootstrap experiment. The results showed that the proposed method helped improve the performance of some classifiers, in most cases with a completely different set of features than the other techniques, and some of the features identified by AUC-EPS are critical for disease identification. The work concluded that the proposed method, AUC-EPS, selects features different from those of FAST and ARCO, helps improve the performance of some classifiers, and identifies features that are crucial for discriminating cancer.
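The scoring step described above can be illustrated with a minimal sketch (not the authors' AUC-EPS implementation): each feature is discretized, the probability of the positive class given each value is estimated with Laplace (add-one) smoothing, and the smoothed probabilities are used as scores to compute a per-feature AUC. The function names and the quantile binning scheme are illustrative assumptions.

```python
import numpy as np

def auc_from_scores(scores, labels):
    """AUC via pairwise comparisons (Mann-Whitney form); ties count half."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

def feature_auc_laplace(x, y, bins=10):
    """Score one feature: estimate P(y=1 | bin) with Laplace smoothing."""
    edges = np.quantile(x, np.linspace(0, 1, bins + 1)[1:-1])
    b = np.digitize(x, edges)
    scores = np.empty(len(x))
    for v in np.unique(b):
        m = b == v
        scores[m] = (y[m].sum() + 1) / (m.sum() + 2)  # add-one smoothing
    return auc_from_scores(scores, y)
```

Features could then be ranked by this smoothed AUC and the top-ranked subset kept, which is the general shape of AUC-based filter selection.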
Item: Acelerando florestas de decisão paralelas em processadores gráficos para a classificação de texto (Universidade Federal de Goiás, 2022-09-12)
Pires, Julio Cesar Batista; Martins, Wellington Santos; http://lattes.cnpq.br/3041686206689904; Martins, Wellington Santos; Lima, Junio César de; Gaioso, Roussian Di Ramos Alves; Franco, Ricardo Augusto Pereira; Soares, Fabrízzio Alphonsus Alves de Melo Nunes

The amount of readily available online text has grown exponentially, requiring efficient methods to automatically manage and sort the data. Automatic text classification provides a means to organize this data by associating documents with classes. However, the use of more data and more sophisticated machine learning algorithms has demanded increasing computing power. In this work we accelerate a novel Random Forest-based classifier that has been shown to outperform state-of-the-art classifiers for textual data. The classifier is obtained by applying the boosting technique to bags of extremely randomized trees (forests) that are built in parallel to improve performance.
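The "extremely randomized" ingredient mentioned above can be sketched in a few lines: unlike a classic Random Forest, each split draws a random feature and a random cut-point instead of searching for the best one. This is a generic sketch of that idea, not the thesis implementation, and the GPU parallelization is omitted.

```python
import numpy as np

def extremely_random_split(X, rng):
    """Extra-Trees style split: random feature, random threshold in its range."""
    f = rng.integers(X.shape[1])          # feature chosen at random
    lo, hi = X[:, f].min(), X[:, f].max()
    t = rng.uniform(lo, hi)               # cut-point chosen at random, not optimized
    return f, t, X[:, f] <= t             # boolean mask of the left partition
```

Because each tree needs only the data and a stream of random numbers, many such trees can be grown independently, which is what makes forests of them amenable to massive parallelism.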
Experimental results using standard textual datasets show that the GPU-based implementation reduces execution time by up to 20 times compared to an equivalent sequential implementation.

Item: Análise multirresolução de imagens gigapixel para detecção de faces e pedestres (Universidade Federal de Goiás, 2023-09-27)
Ferreira, Cristiane Bastos Rocha; Pedrini, Hélio; http://lattes.cnpq.br/9600140904712115; Soares, Fabrízzio Alphonsus Alves de Melo Nunes; http://lattes.cnpq.br/7206645857721831; Soares, Fabrízzio Alphonsus Alves de Melo Nunes; Pedrini, Helio; Santos, Edimilson Batista dos; Borges, Díbio Leandro; Fernandes, Deborah Silva Alves

Gigapixel images, also known as gigaimages, can be formed by merging a sequence of individual images obtained from a scene-scanning process. Such images can be understood as a mosaic built from a large number of high-resolution digital images. A gigapixel image provides a powerful way to observe minimal details that are very far from the observer, enabling research in many areas such as pedestrian detection, surveillance and security. Since this image category involves a high volume of data captured sequentially, its generation and analysis raise many problems, and applying conventional algorithms designed for non-gigapixel images directly can become unfeasible. Thus, this work proposes a method for scanning, manipulating and analyzing multiresolution gigapixel images for pedestrian and face identification applications using traditional algorithms.
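One way to make conventional detectors feasible at this scale, the problem the abstract raises, is to walk the image as overlapping tiles so that only a small window is ever in memory. This is a generic tiling sketch, not the method of the thesis; the tile size and overlap values are illustrative.

```python
def iter_tiles(width, height, tile=2048, overlap=128):
    """Yield (x, y, w, h) windows covering a gigapixel canvas.

    The overlap keeps objects that straddle a tile border fully
    visible in at least one tile, so a per-tile detector does not
    miss them at the seams.
    """
    step = tile - overlap
    for y in range(0, height, step):
        for x in range(0, width, step):
            yield x, y, min(tile, width - x), min(tile, height - y)
```

Each window would then be cropped and fed to a standard face or pedestrian detector, with detections mapped back to global coordinates.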
This approach is analyzed using gigapixel images with both low and high densities of people and faces, with promising results.

Item: Aplicação de técnicas de visualização de informações para os problemas de agendamento de horários educacionais (Universidade Federal de Goiás, 2023-10-20)
Alencar, Wanderley de Souza; Jradi, Walid Abdala Rfaei; http://lattes.cnpq.br/6868170610194494; Nascimento, Hugo Alexandre Dantas do; http://lattes.cnpq.br/2920005922426876; Nascimento, Hugo Alexandre Dantas do; Jradi, Walid Abdala Rfaei; Bueno, Elivelton Ferreira; Gondim, Halley Wesley Alexandre Silva; Carvalho, Cedric Luiz de

An important class of combinatorial optimization problems is known as Educational Timetabling Problems (Ed-TTPs). Broadly, this class includes problems in which teachers, subjects (lectures) and, eventually, rooms must be allocated in order to build a timetable of classes or examinations for a given academic period in an educational institution (school, college, university, etc.). The timetable must observe a set of constraints in order to satisfy, as much as possible, a set of desirable goals. This research proposes the use of methods and techniques from the Information Visualization (IV) area to help non-technical users better understand and solve, interactively, problem instances in the scope of their educational institutions. In the proposed approach, human actions and those performed by a computational system interact symbiotically toward the problem resolution, with the interaction carried out through a graphical user interface that implements ideas originating from the User Hints framework [Nas03].
The main contributions are: (1) the recognition and characterization of the most used techniques for presenting and/or visualizing Ed-TTP solutions; (2) a mathematical notation to formalize the problem specification, including a new notion called flexibility applied to the entities involved in the timetable; (3) visualizations able to contribute to a better understanding of a problem instance; (4) a computational tool for the interactive resolution of Ed-TTPs, together with an entity-relationship model specific to this kind of problem; and (5) a methodology to evaluate visualizations applied to the problem in focus.

Item: Aprimoramento do modelo de seleção dos padrões associativos: uma abordagem de mineração de dados (Universidade Federal de Goiás, 2021-12-20)
Rodrigues, Diego de Castro; Barbosa, Rommel Melgaço; http://lattes.cnpq.br/6228227125338610; Barbosa, Rommel Melgaço; Costa, Ronaldo Martins da; Costa, Nattane Luíza da; Rocha, Marcelo Lisboa; Jorge, Lúcio de Castro

The objective of this study is to improve the association-rule selection model through a set of asymmetric probabilistic metrics. We present Health Association Rules (HAR), based on Apriori; the algorithm is composed of six functions and uses metrics alternative to the support/confidence model to identify the implication X → Y. Initially, our solution focused only on health data, but we realized that asymmetric associative patterns can be applied in other contexts that seek to address the cause and effect of a pattern. Our experiments comprised 60 real datasets taken from specialist websites, research partnerships and open data. We empirically observed the behavior of HAR on all datasets and compared it with the classical Apriori algorithm, finding that it overcomes the main problems of the support/confidence model.
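The support/confidence model the abstract criticizes is easy to state concretely. A minimal sketch of support, confidence and lift for a rule X → Y over a list of transactions (these are the standard textbook metrics, not the HAR algorithm itself):

```python
def rule_metrics(transactions, X, Y):
    """Support, confidence and lift of the rule X -> Y.

    transactions: list of item sets; X, Y: item sets.
    """
    n = len(transactions)
    n_x = sum(1 for t in transactions if X <= t)
    n_y = sum(1 for t in transactions if Y <= t)
    n_xy = sum(1 for t in transactions if (X | Y) <= t)
    support = n_xy / n
    confidence = n_xy / n_x if n_x else 0.0
    lift = confidence / (n_y / n) if n_y else 0.0
    return support, confidence, lift
```

Confidence is asymmetric (X → Y and Y → X differ), but it ignores the base rate of Y: a rule can look strong merely because Y is frequent. Lift corrects for that base rate, which is the kind of weakness asymmetric probabilistic metrics aim to address.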
We were able to identify the most relevant patterns in the observed datasets, eliminating logical contradictions and redundancies. We also performed a statistical analysis of the experiments, in which the effect is positive for HAR. HAR was able to discover more representative patterns and rare patterns, besides performing rule grouping, filtering and ranking. Our solution exhibited linear behavior in the experiments and can be applied to health, social, content-suggestion, product-recommendation and educational data. Not limited to these domains, HAR is prepared to receive large amounts of data through a customized parallel architecture.

Item: Atribuição de papéis em alguns produtos de grafos (Universidade Federal de Goiás, 2022-06-24)
Mesquita, Fernanda Neiva; Dias, Elisângela Silva; http://lattes.cnpq.br/0138908377103572; Nascimento, Julliano Rosa; http://lattes.cnpq.br/8971175373328824; Castonguay, Diane; http://lattes.cnpq.br/4005898623592261; Castonguay, Diane; Rodrigues, Rosiane de Freitas; Dourado, Mitre Costa; Nobrega, Diana Sasaki; Silva, Hebert Coelho da

During the COVID-19 pandemic, the use of social networks was intensified by social distancing and the need to stay connected, generating a gigantic volume of data. To extract information, graphs constitute a powerful modeling tool in which vertices represent individuals and edges represent relationships between them. In 1991, Everett and Borgatti formalized the concept of role assignment under the name role coloring. An r-role assignment of a simple graph G is an assignment of r distinct roles to the vertices of G such that two vertices with the same role see the same set of roles in their neighborhoods.
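The defining condition above is directly checkable: any two vertices sharing a role must see exactly the same set of roles among their neighbors. A small sketch of that verification (a hypothetical helper; adjacency given as a dict of neighbor sets, and the check that all r roles are actually used is omitted):

```python
def is_role_assignment(adj, role):
    """Check the role-assignment condition: vertices sharing a role
    must observe identical role sets in their neighborhoods."""
    seen = {}  # role -> frozenset of roles observed around vertices with that role
    for v, neighbors in adj.items():
        nbr_roles = frozenset(role[u] for u in neighbors)
        if role[v] in seen and seen[role[v]] != nbr_roles:
            return False
        seen[role[v]] = nbr_roles
    return True
```

On the path a-b-c, for instance, roles {a: 1, b: 2, c: 1} satisfy the condition, while {a: 1, b: 1, c: 2} do not, since a and b share role 1 but see different role sets.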
Furthermore, a specific r-role assignment defines a role graph, in which the vertices are the r distinct roles and there is an edge between two roles whenever the graph G has two related vertices with those roles. Research on role assignment combined with graph operations is scarce. We show a dichotomy for the r-role assignment problem on the Cartesian product: while the Cartesian product of two graphs always admits a 2-role assignment, the problem remains NP-complete for any fixed r ≥ 3. The complementary prism arises from the complementary product, introduced by Haynes, Henning and Van Der Merwe in 2019, which is a generalization of the Cartesian product. Complementary prisms admit a 2-role assignment, except for the complementary prism of the path with three vertices. We verified that complementary prisms admit a 3-role assignment, except for the complementary prisms of some disconnected bipartite graphs, and we show that the related problem can be solved in linear time. Finally, we conjecture that, for r ≥ 3, the (r+1)-role assignment problem for complementary prisms is NP-complete. In this direction, we consider the role graph K'_{1,r}, which is the bipartite graph K_{1,r} with a loop at the vertex of degree r, and we highlight that deciding whether a complementary prism has a (r+1)-role assignment when the role graph is K'_{1,r} is NP-complete.

Item: Avaliação da qualidade da sintetização de fala gerada por modelos de redes neurais profundas (Universidade Federal de Goiás, 2023-05-26)
Oliveira, Frederico Santos de; Soares, Anderson da Silva; http://lattes.cnpq.br/1096941114079527; Soares, Anderson da Silva; Aluisio, Sandra Maria; Duarte, Julio Cesar; Laureano, Gustavo Teodoro; Galvão Filho, Arlindo Rodrigues

With the emergence of intelligent personal assistants, the need for high-quality conversational interfaces has increased.
While text-based chatbots are popular, the development of voice interfaces is equally important. However, voice-based conversational models are mainly evaluated through the Mean Opinion Score (MOS), which relies on a manual and subjective process. In this context, this thesis contributes a new methodology for evaluating voice-based conversational interfaces, with a case study conducted in Brazilian Portuguese. The proposed methodology includes an architecture for predicting the quality of synthesized speech in Brazilian Portuguese, correlated with the MOS. To evaluate the methodology, Text-to-Speech models were trained to create the dataset BRSpeechMOS; details about its creation are presented, along with qualitative and quantitative analyses. A series of experiments trained various architectures, based on supervised and self-supervised learning, on the BRSpeechMOS dataset. The results confirm the hypothesis that models pre-trained on voice processing tasks such as speaker verification and automatic speech recognition produce acoustic representations suitable for predicting speech quality, contributing to the state of the art in evaluation methodologies for conversational models.

Item: Classificação de cenas utilizando a análise da aleatoriedade por aproximação da complexidade de Kolmogorov (Universidade Federal de Goiás, 2020-03-15)
Feitosa, Rafael Divino Ferreira; Soares, Anderson da Silva; http://lattes.cnpq.br/1096941114079527; Soares, Anderson da Silva; Delbem, Alexandre Cláudio Botazzo; Soares, Fabrízzio Alphonsus Alves de Melo Nunes; Laureano, Gustavo Teodoro; Costa, Ronaldo Martins da

In many pattern recognition problems, discriminant features are unknown and/or class boundaries are not well defined.
Several studies have used data compression to discover knowledge without feature extraction and selection. The basic idea is that two distinct objects can be grouped as similar if the information content of one explains, in a significant way, the information content of the other. However, compression-based techniques are not efficient for images, as they disregard the semantics present in the spatial correlation of two-dimensional data. A classifier that estimates the visual complexity of scenes is proposed, namely Pattern Recognition by Randomness (PRR). The method operates through data transformations that expand the most discriminating features and suppress details. The main contribution of the work is the use of randomness as a discrimination measure. The approximation between scenes and trained models, based on representational distortion, promotes a lossy compression process. This loss is associated with irrelevant details when the scene is reconstructed with the representation of the true class, or with information degradation when it is reconstructed with divergent representations. The more information preserved, the greater the randomness of the reconstruction. From the mathematical point of view, the method is explained by two main measures in the U-dimensional plane: intersection and dispersion. The results yielded an accuracy of 0.6967 on a 12-class problem and 0.9286 on a 7-class problem; the proposed classifier was superior to k-NN and a data mining toolkit. The method is capable of generating efficient models from few training samples.
It is invariant to vertical and horizontal reflections and resistant to some geometric transformations and image processing operations.

Item: CLAT: arcabouço conceitual e ferramenta de apoio à avaliação da escrita inicial infantil por meio de dispositivos móveis (Universidade Federal de Goiás, 2022-12-21)
Mombach, Jaline Gonçalves; Soares, Fabrizzio Alphonsus Alves de Melo Nunes; http://lattes.cnpq.br/7206645857721831; Soares, Fabrizzio Alphonsus Alves de Melo Nunes; Ferreira, Deller James; Marques, Fátima de Lourdes dos Santos Nunes; Rodrigues, Kamila Rios da Hora; Rocha, Maria Alice de Sousa Carvalho

In childhood literacy, the assessment of initial writing is essential for monitoring learning and, consequently, for planning more effective interventions by educators. During the COVID-19 pandemic, however, early spelling assessments were hampered, since the available digital tools did not capture some strategic signals, such as the visualization of the child's tracing, the reading mode, and the child's genuine thinking about writing. Therefore, as a research problem, we investigated how mobile devices could support the remote assessment of children's spelling. The central goal was to develop an interaction model for mobile devices to support these writing assignments remotely. We adopted Design Science Research as the methodological approach. In the problem-study stage, we conducted a systematic mapping study and a survey with professionals and parents, and documented the usability requirements. Next, we proposed one artifact for educators to create digital assignments and another to capture children's tracing and the mode in which they read. Finally, for validation, we performed concept tests with teachers and children and a validation experiment in the school ecosystem, involving 92 children and six teachers.
The results indicated that the children were notably interested in the resource and interacted satisfactorily with the digital artifact, validating the interaction model by registering their writing without significant difficulties. Furthermore, the teachers stated that it is possible to evaluate children's spelling from the records visualized on the digital artifact, and emphasized the similarity between the interactions promoted by the artifacts and the face-to-face environment. The findings contribute to research on digital writing development and new educational resources. At the social level, the proposal also contributes directly to sustaining teaching in remote environments, while bringing new possibilities for face-to-face and blended learning.

Item: Controle de admissão para network slicing considerando recursos de comunicação e computação (Universidade Federal de Goiás, 2023-05-10)
Lima, Henrique Valle de; Cardoso, Kleber Vieira; http://lattes.cnpq.br/0268732896111424; Corrêa, Sand Luz; http://lattes.cnpq.br/3386409577930822; Corrêa, Sand Luz; Cardoso, Kleber Vieira; Oliveira Júnior, Antônio Carlos de; Costa, Ronaldo Martins da; Both, Cristiano Bonato

5G networks have enabled the application of various innovative and disruptive technologies, such as Network Function Virtualization (NFV) and Software-Defined Networking (SDN). Together, these technologies act as enablers of Network Slicing (NS), transforming the way networks are operated, managed, and monetized. Through the concept of Slice-as-a-Service (SlaaS), telecommunications operators can monetize physical and logical infrastructure by offering network slices to new customers, such as vertical industries. This thesis addresses the problem of tenant admission control using NS. We propose three admission control models for NS (MONETS-OBD, MONETS-OBS, and CAONS) that consider both communication and computation resources.
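The core constraint of the thesis, admitting a slice only when both resource dimensions fit, can be shown with a deliberately simple greedy sketch. This is not any of the MONETS/CAONS models named above; the capacities, request fields and the reward-density heuristic are illustrative assumptions.

```python
def admit_slices(requests, bw_cap, cpu_cap):
    """Greedily admit slice requests that fit BOTH bandwidth and CPU.

    Each request is (slice_id, bw_demand, cpu_demand, reward).
    Checking only the communication dimension would overestimate
    the computation capacity available at the edge.
    """
    admitted, bw_used, cpu_used = [], 0.0, 0.0
    # simple heuristic: favor high reward per unit of combined demand
    for sid, bw, cpu, reward in sorted(
            requests, key=lambda r: r[3] / (r[1] + r[2]), reverse=True):
        if bw_used + bw <= bw_cap and cpu_used + cpu <= cpu_cap:
            admitted.append(sid)
            bw_used += bw
            cpu_used += cpu
    return admitted
```

A request is rejected as soon as either dimension would overflow, which is the two-resource coupling the abstract argues is essential.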
To evaluate the proposed models, we compare the results with classical algorithms from the literature, such as eUCB, ε-greedy, and ONETS, and use data from different applications to enrich the analysis. The results indicate that the MONETS-OBD, MONETS-OBS, and CAONS heuristics perform admission control close to the set of ideal solutions. The MONETS-OBD and MONETS-OBS heuristics achieve high efficiency in tenant admission control, reaching acceptance rates of up to 99% in some cases. Furthermore, the CAONS heuristic, which employs penalties, not only achieves acceptance and reward rates close to the optimal solution but also significantly reduces the number of capacity violations. Lastly, the results highlight that slice admission control should consider both communication and computation resources, which are scarce at the network edge: a solution that considers only communication resources can lead to incorrect and unfeasible decisions by overestimating the capacity of computation resources.

Item: Design de experiência aplicado a times (Universidade Federal de Goiás, 2024-10-18)
Alves, Leonardo Antonio; Soares, Anderson da Silva; http://lattes.cnpq.br/1096941114079527; Soares, Anderson da Silva; Ferreira, Deller James; Lucena, Fábio Nogueira de; Dias, Rodrigo da Silva; Federson, Fernando Marques

Despite recent advances, current gamification methodologies still face challenges in effectively personalizing learning experiences and accurately assessing the development of specific competencies.
This thesis presents the Marcta Autonomy Framework (MAF), an innovative framework that aims to overcome these limitations by increasing team members' motivation and participation while promoting personal development and skills through a personalized experience. The MAF, consisting of six phases (Planning, Reception, Advancement, Feedback, Process Evaluation, and Lessons and Adjustments), guides the development of activities with both intrinsic and extrinsic rewards. The research was applied in two academic case studies: a Software Factory and an Introduction to Programming course for students of the Bachelor's degree in Artificial Intelligence. Using a qualitative approach, including interviews and observations, the results demonstrate that the MAF significantly enhances the development of personal skills. The analysis suggests that the framework can be applied both within a course and in a specific discipline. The main contribution of the MAF lies in its ability to provide a structured roadmap for planning and evaluating pedagogical actions focused on personal skills development. Furthermore, the framework leverages data that are easy to capture through observation, context, and evaluations.
It is concluded that the MAF stands as a personalized and affective gamification solution for experience design in learning, promoting the development of personal skills in both academic and corporate contexts.

Item: Detecção automática e avaliação de linhas de plantio de cana-de-açúcar em imagens aéreas (Universidade Federal de Goiás, 2021-12-09)
Rocha, Bruno Moraes; Pedrini, Hélio; http://lattes.cnpq.br/9600140904712115; Soares, Fabrízzio Alphonsus Alves de Melo Nunes; http://lattes.cnpq.br/7206645857721831; Soares, Fabrízzio Alphonsus Alves de Melo Nunes; Pedrini, Hélio; Salvini, Rogerio Lopes; Costa, Ronaldo Martins da; Cabacinha, Christian Dias

Several imaging techniques based on sugarcane field images have been developed to increase productivity and economic yield. However, identifying and measuring gaps in sugarcane crop rows is still commonly performed manually on site to decide whether to replant the gaps or the entire area, and manual measurement has a high cost in time and manpower. This study therefore aimed to create a new technique that automatically identifies and evaluates the gaps along crop rows in aerial images of sugarcane fields obtained by a small remotely piloted aircraft. The captured images were used to generate orthomosaics of the crop area, which were classified with the K-Nearest Neighbors algorithm to segment the crop rows. The orientation of the planting rows in the image was found using a Red Green Blue (RGB) gradient filter. The crop rows were then mapped using a curve-fitting method and overlaid on the classified image to detect and measure the gaps along each planting-row segment.
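The gap-measurement step can be sketched one-dimensionally: once a crop row is mapped, plant presence along it becomes a boolean sequence, and a gap is a run of absence longer than a threshold (Stolf's classical criterion uses 0.5 m). This sketch assumes a known ground resolution per sample and is not the thesis pipeline.

```python
def row_gaps(presence, m_per_px, min_gap_m=0.5):
    """Return (start_m, length_m) for each gap: a run of False
    longer than min_gap_m along a mapped planting row."""
    gaps, run_start = [], None
    for i, plant in enumerate(list(presence) + [True]):  # sentinel closes last run
        if not plant and run_start is None:
            run_start = i
        elif plant and run_start is not None:
            length = (i - run_start) * m_per_px
            if length >= min_gap_m:
                gaps.append((run_start * m_per_px, length))
            run_start = None
    return gaps
```

Summing the gap lengths over all rows of the orthomosaic would then yield the per-area gap statistics used to support replanting decisions.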
The technique obtained a maximum error of approximately 3% compared with the manual method for evaluating the length of gaps in the crop rows, in an orthomosaic covering 8.05 hectares, using the method proposed by Stolf adapted for digital images. The approach properly identified the spatial position of automatically generated line segments over manually created ones, and achieved results statistically similar to the manual image-based technique for mapping rows and identifying gaps in sugarcane fields at 40 and 80 days after planting. The automatic technique thus showed significant results in evaluating gaps in crop rows in aerial images of sugarcane fields; it enables automated inspections with high-accuracy measurements and can assist producers in management decisions for their sugarcane fields.

Item: Escalonamento de recursos em redes sem fio 5G baseado em otimização de retardo e de alocação de potência considerando comunicação dispositivo a dispositivo (Universidade Federal de Goiás, 2021-10-15)
Ferreira, Marcus Vinícius Gonzaga; Vieira, Flávio Henrique Teles; http://lattes.cnpq.br/0920629723928382; Vieira, Flávio Henrique Teles; Madeira, Edmundo Roberto Mauro; Lima, Marcos Antônio Cardoso de; Rocha, Flávio Geraldo Coelho; Oliveira Júnior, Antônio Carlos de

In this thesis, a resource scheduling scheme is proposed for 5G wireless networks based on CP-OFDM (Cyclic Prefix - Orthogonal Frequency Division Multiplexing) and f-OFDM (filtered OFDM) modulations, in order to optimize the average delay and the power allocation for users. In the proposed approach, the transmission rate is calculated and the modulation format is chosen so as to minimize the system BER (Bit Error Rate).
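The "choose the transmission mode that keeps the BER acceptable" step can be illustrated with textbook approximations for Gray-coded square M-QAM over an AWGN channel. These closed-form BER formulas, the target value and the function names are generic illustrations, not the scheduler proposed in the thesis.

```python
import math

def qfunc(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def ber_mqam(M, snr_linear):
    """Approximate BER of Gray-coded square M-QAM over AWGN."""
    k = math.log2(M)
    return (4 / k) * (1 - 1 / math.sqrt(M)) * qfunc(
        math.sqrt(3 * snr_linear / (M - 1)))

def pick_modulation(snr_db, target_ber=1e-5, orders=(4, 16, 64, 256)):
    """Highest-order QAM whose predicted BER stays under the target
    (returns None if even QPSK misses the target)."""
    snr = 10 ** (snr_db / 10)
    best = None
    for M in orders:
        if ber_mqam(M, snr) <= target_ber:
            best = M
    return best
```

Higher SNR admits denser constellations and hence higher rate, which is the trade-off an adaptive scheduler exploits per resource block.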
Besides the transmission modes determined to minimize the BER, the algorithm considers the system's weighted throughput to optimize the users' average delay. Additionally, an algorithm is proposed for uplink transmission in 5G wireless networks with D2D (Device-to-Device) multi-sharing communication, which first allocates resources to the CUEs (Cellular User Equipments) and then allocates network resources to communication between DUE (D2D User Equipment) pairs, based on the optimization of delay and power allocation. The proposed algorithm, namely DMCG (Delay Minimization Conflict Graph), minimizes an estimated delay function, using concepts from Network Calculus, to decide on allocating idle resources of the network's CUEs to DUE pairs. The performance of the proposed algorithms for downlink and uplink transmission is verified and compared with other algorithms in the literature in terms of several QoS (Quality of Service) parameters, considering carrier aggregation and 256-QAM (Quadrature Amplitude Modulation). The computational simulations also consider scenarios with millimeter-wave propagation and the 5G specifications of 3GPP (3rd Generation Partnership Project) Release 15.
The simulation results show that the proposed downlink and uplink algorithms provide better system performance in terms of throughput and delay, with lower processing time than the optimization heuristics and with the other QoS parameters comparable to those of the compared algorithms.

Item: Escolha de parâmetros aplicados a modelos inteligentes para o incremento da qualidade do aprendizado de sinais de EEG captados por dispositivo de baixo custo (Universidade Federal de Goiás, 2024-07-10)
Silva, Uliana Duarte; Felix, Juliana Paula; http://lattes.cnpq.br/3610115951590691; Nascimento, Hugo Alexandre Dantas do; http://lattes.cnpq.br/2920005922426876; Nascimento, Hugo Alexandre Dantas do; Pires, Sandrerley Ramos; Carvalho, Sérgio Teixeira de; Carvalho, Sirlon Diniz de; Melo, Francisco Ramos de

Since the creation of the first electroencephalography (EEG) equipment at the beginning of the 20th century, several studies have been carried out based on this technology. More recently, investigations into machine learning applied to the classification of EEG signals have become popular. Such studies commonly adopt a sequence of steps involving filters, signal windowing, feature extraction and the division of data into training and test sets. The choice of parameters for these steps is an important task, as it impacts classification performance. On the other hand, finding the best combination of parameters is exhaustive work that has only been partially addressed in the literature, particularly when considering many parameter options, the progressive growth of the training set and data acquired from low-cost EEG equipment. This thesis contributes to the area with extensive research on the choice of parameters for processing and classifying EEG signals, involving both raw signals and specific wave data collected with low-cost equipment.
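The "specific wave data" mentioned above refers to the classical EEG frequency bands (delta, theta, alpha, beta). A minimal sketch of the windowing-plus-feature step, computing per-band power from an FFT of one window, under assumed band edges and sampling rate; this is a generic pipeline step, not the thesis code.

```python
import numpy as np

# classical EEG bands in Hz (band edges vary slightly across the literature)
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def band_powers(window, fs):
    """Mean spectral power of one EEG window for each classical band."""
    freqs = np.fft.rfftfreq(len(window), 1 / fs)
    power = np.abs(np.fft.rfft(window)) ** 2
    return {name: power[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in BANDS.items()}
```

Each 18-second recording would be cut into (possibly overlapping) windows and each window reduced to features like these before training and testing the classifiers.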
The EEG signals were acquired from ten participants, who were asked to observe a small white ball that moved to the right, moved to the left or remained stationary. The observation was repeated 24 times in random order, and each observation situation lasted 18 seconds. Different parameter settings and machine learning methods were evaluated for classifying the EEG signals. We sought the best parameter configuration for each participant individually, as well as a common configuration for several participants simultaneously. For the individualized classifications, the results indicate better accuracy when using data from specific waves instead of raw signals; using larger windows also led to better results. When choosing a common parameter combination for multiple participants, the results resemble those obtained when searching for the best parameters per participant. In this case, parameter combinations using data from specific waves showed an average accuracy increase of 8.69% (standard deviation 4.02%), while the average increase using raw signals was 7.82% (standard deviation 2.81%), compared with the overall average accuracy results. Still for the parameterization common to several participants, the maximum accuracies using data from specific waves were higher than those obtained with raw signals, and the largest windows appeared among the best results.

Item: Estudo, definição e proposta de representação de interface web visando à atividade de teste de software (Universidade Federal de Goiás, 2016-04-01)
Jorge, Rodrigo Funabashi; Vincenzi, Auri Marcelo Rizzo; Vincenzi, Auri Marcelo Rizzo; Camilo Júnior, Celso Gonçalves; Leitão Júnior, Plínio de Sá; Oliveira, Celso Socorro; Jubileu, Andrea Padovan

The main purpose of software engineering is to support software development, from specification to implementation and maintenance, by applying methods, processes and tools in pursuit of a higher-quality software product.
One of the activities used to achieve the desired quality is software testing. This activity can become very complex, depending on the characteristics and dimensions of the software product under development, and is thus subject to various kinds of problems which may eventually result in a faulty product, jeopardizing its quality. Despite the complexity and limitations of testing, the literature offers different techniques for generating test data that satisfy several testing criteria, aiming to reduce the cost of testing. However, test data generation is an undecidable problem due to the complexity, constraints, and size of programs. One factor that increases this complexity is the use of user interfaces (UIs), present in many applications, because of the high number of possible input combinations, which makes it virtually impossible to perform UI tests manually. Among the alternatives that enable automation, one of the most recognized and advantageous is model-based UI testing. This technique involves constructing a model that abstracts the UI elements, their interactions, and the structure to be tested; from this model, test data can be generated. A troublesome aspect of this approach, however, lies in building the model: the process can be costly and time-consuming and, even after the effort, the model can be incomplete and may not precisely represent the actual characteristics of the application. When studying the state of the art in UI testing, we noted that there are tools that perform such testing automatically, but they also have limitations, mainly arising from the representation model adopted. Thus, the purpose of this thesis is to propose a UI representation model that brings benefits over the representations in the current literature, advancing the state of the art in this area.
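The model-based UI testing idea described above can be sketched in miniature: the UI is abstracted as a graph whose nodes are screens and whose edges are user actions, and abstract test cases are enumerated as action sequences over that graph. The screens and actions below are hypothetical, and this sketch is not the representation model proposed in the thesis.

```python
from collections import deque

# Hypothetical UI model: screens are nodes, user actions are labeled edges.
ui_model = {
    "login":   [("submit_valid", "home"), ("submit_invalid", "login")],
    "home":    [("open_profile", "profile"), ("logout", "login")],
    "profile": [("back", "home")],
}

def generate_test_paths(model, start, max_len):
    """Enumerate action sequences of up to `max_len` steps from `start`.

    Each path is a candidate abstract test case that a driver (e.g. a
    browser-automation tool) could later concretize and execute.
    """
    paths, queue = [], deque([(start, [])])
    while queue:
        state, actions = queue.popleft()
        if actions:
            paths.append(actions)
        if len(actions) == max_len:
            continue
        for action, target in model.get(state, []):
            queue.append((target, actions + [action]))
    return paths

tests = generate_test_paths(ui_model, "login", max_len=2)
print(len(tests))
```

The practical difficulty noted in the abstract shows up here as well: the value of the generated tests depends entirely on how complete and accurate the `ui_model` abstraction is.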
With this model, a graphical interface of web software applications can be represented at a high level of detail. A preliminary study comparing the model with others available in the literature highlights the benefits achieved.

Item Exploiting parallelism in document similarity tasks with applications (Universidade Federal de Goiás, 2019-09-05) Amorim, Leonardo Afonso; Martins, Wellington Santos; http://lattes.cnpq.br/3041686206689904; Martins, Wellington Santos; Vincenzi, Auri Marcelo Rizzo; Rodrigues, Cássio Leonardo; Rosa, Thierson Couto; Martins, Weber

The amount of data available continues to grow rapidly, and much of it corresponds to text expressing human language, which is unstructured in nature. One way of giving structure to this data is to convert the documents into vectors of features corresponding to word frequencies (term count, tf-idf, etc.) or word embeddings. This transformation allows textual data to be processed with operations such as similarity measurement, similarity search, and classification, among others. However, this is only possible thanks to more sophisticated algorithms that demand higher computational power. In this work, we exploit parallelism to enable parallel algorithms for document similarity tasks and apply some of the results to an important application in software engineering. Similarity search over textual data is commonly performed as a k-nearest-neighbor search, in which pairs of document vectors are compared and the k most similar are returned. For this task we present FaSSTkNN, a fine-grained parallel algorithm that applies filtering techniques based on the most important terms of the query document, using tf-idf. The algorithm, implemented on a GPU, improved the top-k nearest-neighbor search by up to 60x compared to a baseline also running on a GPU.
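The underlying computation can be illustrated with a small sequential sketch: documents become tf-idf vectors and the query's nearest neighbors are found by cosine similarity. This is the brute-force baseline that FaSSTkNN accelerates and prunes (using the query's most important terms) on a GPU; the toy corpus and the simplified smoothed idf formula below are assumptions for illustration.

```python
import math
from collections import Counter

docs = [
    "gpu parallel search",
    "parallel document similarity search",
    "cooking pasta recipe",
]

def tfidf_vectors(corpus):
    """Sparse tf-idf vectors (dicts) with a simple smoothed idf."""
    tokenized = [doc.split() for doc in corpus]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: tf[t] * math.log((1 + n) / (1 + df[t])) for t in tf})
    return vectors

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = tfidf_vectors(docs)
query = vecs[0]
# Brute-force k-NN: rank the other documents by similarity to the query.
ranked = sorted(range(1, len(docs)), key=lambda i: cosine(query, vecs[i]),
                reverse=True)
print(ranked[0])
```

On a realistic corpus this pairwise comparison dominates the cost, which is why filtering out documents that share no important terms with the query pays off.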
Document similarity using tf-idf is based on a scoring scheme that reflects how important a word is to a document in a collection. More recently, a more sophisticated representation, the word embedding, has become popular: it creates a vector for each word that captures co-occurrence relationships between words in a given context, encoding complex semantic relationships. To generate word embeddings efficiently, we propose a fine-grained parallel algorithm that finds the k least similar (farthest) words to produce the negative samples used to train the embeddings. The algorithm, implemented on a multi-GPU system, scaled linearly and generated embeddings 13x faster than the original multicore Word2Vec algorithm, while keeping accuracy at the same level as standard word embedding programs. Finally, we applied our accelerated word embedding solution to the problem of assessing the quality of fixes in Automated Software Repair. The proposed implementation handled large corpora in a computationally efficient way, making it a promising alternative for processing the millions of source code files this task requires.

Item Framework para sistemas de recomendação baseados em neural contextual Bandits com restrição de justiça (Universidade Federal de Goiás, 2024-06-03) Santana, Marlesson Rodrigues Oliveira de; Soares, Anderson da Silva; http://lattes.cnpq.br/1096941114079527; Soares, Anderson da Silva; Rosa, Thierson Couto; Carvalho, Cedric Luiz De; Araújo, Aluizio Fausto Ribeiro; Veloso, Adriano

The advent of digital businesses such as marketplaces, in which a company mediates commercial transactions between different actors, poses challenges to recommendation systems because it is a multi-stakeholder scenario. In this scenario, the recommendation must balance conflicting objectives between the parties, such as relevance versus exposure.
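The relevance-versus-exposure tension can be made concrete with a toy bandit: an epsilon-greedy policy recommends items by estimated click-through rate, but a fairness constraint caps the share of total exposure any single item may receive. This is a deliberately simplified sketch with made-up item names and click rates, not the Neural Contextual Bandits model or the MARS-Gym framework developed in the thesis.

```python
import random

random.seed(0)

CTR = {"item_a": 0.30, "item_b": 0.10, "item_c": 0.08}  # hidden relevance
MAX_SHARE = 0.5   # fairness: no item may exceed 50% of total exposure
EPSILON = 0.1

clicks = {i: 0 for i in CTR}
shows = {i: 0 for i in CTR}

def eligible(t):
    """Items whose exposure share would stay under the cap if shown now."""
    return [i for i in CTR if (shows[i] + 1) / (t + 1) <= MAX_SHARE]

def choose(t):
    arms = eligible(t) or list(CTR)   # fall back if the cap excludes everyone
    if random.random() < EPSILON:
        return random.choice(arms)    # exploration
    # Exploitation by empirical click rate, optimistic for unseen arms.
    return max(arms, key=lambda i: clicks[i] / shows[i] if shows[i] else 1.0)

for t in range(2000):
    item = choose(t)
    shows[item] += 1
    clicks[item] += random.random() < CTR[item]

share_a = shows["item_a"] / 2000
print(round(share_a, 2))
```

Without the cap, the most relevant item would absorb nearly all exposure; with it, the policy trades a small amount of relevance for a fairer distribution, which is the trade-off the thesis controls in a learned, contextual setting.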
State-of-the-art models that address the problem in a supervised way not only assume that recommendation is a stationary problem but are also user-centered, which leads to long-term system degradation. This thesis models the recommendation system as a reinforcement learning problem, through a Markov decision process under uncertainty in which the different interests of the stakeholders can be modeled in an environment with fairness constraints. The main challenge is the need for real interactions between the stakeholders and the recommendation system in a continuous cycle of events that enables online learning. We present a model proposal based on Neural Contextual Bandits with fairness constraints for multi-stakeholder scenarios. As results, we present MARS-Gym, a framework for modeling, training, and evaluating recommendation systems based on reinforcement learning, and the development of different recommendation policies with fairness control adaptable to Neural Contextual Bandits models, which increased fairness metrics in all presented scenarios while controlling the reduction in relevance metrics.

Item FTMES@r: um método de localização de defeitos baseado em estratégias de execução de mutantes (Universidade Federal de Goiás, 2018-12-18) Oliveira, André Assis Lôbo de; Camilo Júnior, Celso Gonçalves; http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4736184D1; Camilo Júnior, Celso Gonçalves; Vincenzi, Auri Marcelo Rizzo; Rodrigues, Cássio Leonardo; Freitas, Eduardo Noronha de Andrade; Leitão, Plínio de Sá

Fault localization has been one of the most manual and costly software debugging activities. Spectrum-based fault localization (SBFL) is the most studied and evaluated fault localization approach.
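SBFL ranks program elements by a suspiciousness score computed from the coverage of passing and failing tests; a widely used metric is the Ochiai coefficient. The sketch below uses a hypothetical three-statement spectrum (not the thesis's experimental subjects) and adds an @r cut in the spirit of SFilter@r, keeping only the top-ranked elements for later mutation.

```python
import math

coverage = {            # statement -> set of tests that cover it (hypothetical)
    "s1": {"t1", "t2", "t3"},
    "s2": {"t2", "t3"},
    "s3": {"t1"},
}
failing = {"t3"}
passing = {"t1", "t2"}

def ochiai(covered_by):
    """Ochiai suspiciousness: ef / sqrt(totalf * (ef + ep))."""
    ef = len(covered_by & failing)   # failing tests covering the element
    ep = len(covered_by & passing)   # passing tests covering the element
    denom = math.sqrt(len(failing) * (ef + ep))
    return ef / denom if denom else 0.0

scores = {s: ochiai(tests) for s, tests in coverage.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
r = 2
reduced = ranking[:r]   # only these elements would be mutated, as in SFilter@r
print(reduced)
```

Cutting the ranking at @r is what makes the subsequent mutation stage affordable: mutants are generated only for the few elements SBFL already considers suspicious.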
Mutation-based fault localization (MBFL) is a promising approach in terms of localization effectiveness, but it has a high computational cost due to the executions of test cases against program mutants. In this context, this thesis proposes FTMES@r, a fault localization method that reduces the computational cost of MBFL while maintaining its localization effectiveness. Unlike other reduction techniques, FTMES@r optimizes two stages: i) the selection of program elements (SFilter@r) and ii) the execution of the mutants (FTMES). The SFilter@r component exploits the accuracy of the SBFL approach to form a smaller ranking, selecting the program elements up to a given position @r of the full ranking. Thus, SFilter@r provides the first level of MBFL cost reduction, because mutants are generated only for the program elements in this reduced ranking. In the mutant execution stage, the Failed-Test-Oriented Mutant Execution Strategy (FTMES) component applies the second level of cost reduction by executing mutants only against the set of failing test cases (Tf), avoiding executions against the set of passing test cases (Tp). The experimental evaluation comprises 10 localization techniques, 221 real defects, and 6 evaluation metrics. The results show that FTMES@r presents the best cost-benefit trade-off among the studied techniques.

Item Future-Shot: Few-Shot Learning to tackle new labels on high-dimensional classification problems (Universidade Federal de Goiás, 2024-02-23) Camargo, Fernando Henrique Fernandes de; Soares, Anderson da Silva; http://lattes.cnpq.br/1096941114079527; Soares, Anderson da Silva; Galvão Filho, Arlindo Rodrigues; Vieira, Flávio Henrique Teles; Gomes, Herman Martins; Lotufo, Roberto de Alencar

This thesis introduces a novel approach to high-dimensional multiclass classification challenges, particularly in dynamic environments where new classes emerge.
Named Future-Shot, the method employs metric learning, specifically triplet learning, to train a model capable of generating embeddings for both data points and classes within a shared vector space. This enables efficient similarity comparisons with techniques such as k-nearest neighbors (k-NN), allowing new classes to be integrated seamlessly without extensive retraining. Tested on lab-of-origin prediction tasks using the Addgene dataset, Future-Shot achieves a top-10 accuracy of 90.39%, surpassing existing methods. Notably, in few-shot learning scenarios, it achieves an average top-10 accuracy of 81.2% with just 30% of the data for new classes, demonstrating robustness and efficiency in adapting to evolving class structures.
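The shared-embedding idea behind this seamless integration can be sketched as follows: once an encoder maps inputs into a common vector space, registering a new class only requires embedding a few of its examples, and prediction is a nearest-class lookup. The "encoder" below is a fixed random projection standing in for the learned triplet-loss model, and the class names are hypothetical; this is not the thesis's trained system.

```python
import numpy as np

rng = np.random.default_rng(42)
W = rng.normal(size=(8, 3))     # frozen encoder: 8-d input -> 3-d embedding

def embed(x):
    """Project into the shared space and L2-normalize (cosine-ready)."""
    v = x @ W
    return v / np.linalg.norm(v)

# Known classes, each represented by an embedding built from its examples.
class_protos = {
    "lab_a": embed(np.ones(8)),
    "lab_b": embed(np.arange(8.0)),
}

def predict(x, protos):
    """Nearest-class lookup by cosine similarity (k-NN with k = 1 here)."""
    e = embed(x)
    return max(protos, key=lambda c: float(e @ protos[c]))

# A new class appears: one labeled example is enough to register it,
# with no retraining of the encoder.
new_example = rng.normal(size=8)
class_protos["lab_c"] = embed(new_example)

query = new_example + 0.01 * rng.normal(size=8)  # a near-duplicate query
print(predict(query, class_protos))
```

Because the encoder itself never changes, the cost of supporting a new label is just a few forward passes, which is what makes the few-shot scenario in the abstract practical.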