Mestrado em Ciência da Computação (INF)
Browsing Mestrado em Ciência da Computação (INF) by Title
Now showing 1 - 20 of 290
Item A comparative study of text classification techniques for hate speech detection (Universidade Federal de Goiás, 2022-01-27) Silva, Rodolfo Costa Cezar da; Rosa, Thierson Couto; http://lattes.cnpq.br/4414718560764818; Rosa, Thierson Couto; Moura, Edleno Silva de; Silva, Nádia Félix Felipe da
The dissemination of hate speech on the Internet, especially on social media platforms, has been a serious and recurrent problem. In the present study, we compare eleven methods for classifying hate speech, including traditional machine learning methods, neural network-based approaches and transformers, as well as their combination with eight techniques to address the class imbalance problem, which is a recurrent issue in hate speech classification. The data transformation techniques we investigated include data resampling techniques and a modification of a technique based on compound features (c_features). All models were tested on seven datasets of varying specificity, following a rigorous experimentation protocol that includes cross-validation and the use of appropriate evaluation metrics, as well as validation of the results through statistical tests suited to multiple comparisons. To our knowledge, there is no broader comparative study of data-enhancing techniques for hate speech detection, nor any work that combines data resampling techniques with transformers. Our extensive experimentation, based on over 2,900 measurements, reveals that most data resampling techniques are ineffective at improving classifier effectiveness, with the exception of ROS, which improves most classification methods, including the transformers. For the smallest dataset, ROS provided gains of 60.43% and 33.47% for BERT and RoBERTa, respectively. The experiments revealed that c_features improved all the classification methods they could be combined with. The compound features technique provided satisfactory gains of up to 7.8% for SVM. Finally, we investigate cost-effectiveness for a few of the best classification methods. This analysis confirmed that the traditional method Logistic Regression (LR), combined with the use of c_features, can provide high effectiveness with low overhead on all datasets considered.

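As an illustration of the resampling idea above, here is a minimal sketch of random oversampling (the usual reading of ROS) applied before fitting a traditional classifier, using the imbalanced-learn and scikit-learn libraries; the toy corpus, TF-IDF features and Logistic Regression choice are placeholders rather than the study's actual pipeline.

```python
# Hypothetical sketch: random oversampling (ROS) before training a classifier
# on an imbalanced hate-speech corpus. The six-document corpus is toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from imblearn.over_sampling import RandomOverSampler

texts = ["I hate you", "go away, you idiot", "have a nice day",
         "see you tomorrow", "great match yesterday", "thanks for the help"]
labels = [1, 1, 0, 0, 0, 0]              # 1 = hate speech (minority class)

X = TfidfVectorizer().fit_transform(texts)

# Replicate minority-class examples until both classes have the same size.
X_res, y_res = RandomOverSampler(random_state=0).fit_resample(X, labels)

clf = LogisticRegression(max_iter=1000).fit(X_res, y_res)
# Evaluated on the same toy data only to keep the sketch short.
print(f1_score(labels, clf.predict(X), average="macro"))
```
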
Item Abordagem baseada em metamodelos para a representação e modelagem de características em linhas de produto de software dinâmicas (Universidade Federal de Goiás, 2016-09-06) Silva, Flayson Potenciano e; Carvalho, Sérgio Teixeira de; http://lattes.cnpq.br/2721053239592051; Carvalho, Sérgio Teixeira de; Gomes, Alan Keller; Souza-Zinader, Juliana Pereira de
This dissertation presents a requirement representation approach for Dynamic Software Product Lines (DSPLs). DSPLs are oriented towards the design of adaptive applications, and each requirement is represented as a feature. Traditionally, features are represented in a Software Product Line (SPL) by a Feature Model (FM); however, such a model does not natively support the representation of dynamic features. This dissertation proposes an extension to the FM that adds a representation for dynamic features, so that the model gains expressivity regarding context change conditions and the application itself. To this end, a metamodel based on the Ecore meta-metamodel has been developed to enable the definition of both Dynamic Feature Models (the proposed extension to the FM) and Dynamic Feature Configurations (DFC), the latter used to describe the possible configurations of products at runtime. In addition to the representation for dynamic features and the metamodel, this dissertation provides a tool that interprets the proposed model and supports the design of Dynamic Feature Models. Simulations involving dynamic feature state changes were carried out, considering scenarios of a ubiquitous monitoring application for homecare patients.

Item Uma abordagem baseada em modelos para construção automática de interfaces de usuário para Sistemas de Informação (Universidade Federal de Goiás, 2011-06-15) COSTA, Sofia Larissa da; OLIVEIRA, Juliano Lopes de; http://lattes.cnpq.br/8890030829542444
Building user interfaces for Information Systems (IS) involves modeling and coding appearance (presentation) and behavioral (interaction) aspects. This work presents a model-based approach to building these interfaces using tools for automatic model transformation and interface code generation. The proposed approach applies the concept of Interface Stereotype, introduced in this work, which identifies, at a high level of abstraction, features of user interface (UI) appearance and behavior, independently of the underlying IS application. A taxonomy of interface elements is proposed as the basis for stereotype definition, along with an interface behavior specification mechanism that allows actions and restrictions on the stereotypes to be expressed precisely, objectively, and independently of the interface implementation platform. An architecture is also proposed for a software component that manages model-based user interface building, and it defines how this component can be integrated into the IS development process. The model-based user interface development approach proposed in this work brings benefits in terms of construction effort and cost, facilitating the maintenance and evolution of IS user interfaces. Furthermore, the use of stereotypes promotes consistency and standardization of both the presentation and the behavior of interfaces, improving IS usability.

Item Uma abordagem computacional para predição de mortalidade em utis baseada em agrupamento de processos gaussianos (Universidade Federal de Goiás, 2016-09-09) Caixeta, Rommell Guimarães; Soares, Anderson da Silva; http://lattes.cnpq.br/1096941114079527; Bulcão Neto, Renato de Freitas; http://lattes.cnpq.br/5627556088346425; Bulcão Neto, Renato de Freitas; Soares, Anderson da Silva; Laureano, Gustavo Teodoro; Oliveira, Marco Antônio Assfalk de
The analysis of a patient's physiological variables can improve death risk classification in Intensive Care Units (ICUs) and help decision making and resource management. This work proposes a computational approach to death prediction through the analysis of physiological variables in the ICU. Physiological variables that form time series (e.g., blood pressure) are represented as Dependent Gaussian Processes (DGPs). Variables that do not represent time series (e.g., age) are used to cluster DGPs with Decision Trees. Classification is performed according to a distance measure that combines Dynamic Time Warping (DTW) and the Kullback-Leibler divergence. The results of this approach are superior to those of SAPS-I, a method already in use, on the test dataset considered, and are similar to those of other computational methods published by the research community. The results comparing variations of the proposed method show that there is an advantage in using the proposed clustering of DGPs.

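To make the combined distance concrete, the sketch below pairs a textbook DTW implementation with a symmetrized Kullback-Leibler term between Gaussian summaries of the two series; the 50/50 weighting, the Gaussian summaries and the toy blood-pressure values are assumptions for illustration, not the exact formulation used in the dissertation.

```python
# Hypothetical sketch of a distance that combines DTW (on the time series)
# with a Kullback-Leibler term (on Gaussian summaries of each series).
import numpy as np

def dtw(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def kl_gauss(mu1, s1, mu2, s2):
    """KL divergence between univariate Gaussians N(mu1, s1^2) and N(mu2, s2^2)."""
    return np.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5

def combined_distance(a, b, w=0.5):
    """Weighted combination of DTW and a symmetrized KL divergence (illustrative)."""
    kl = (kl_gauss(a.mean(), a.std(), b.mean(), b.std())
          + kl_gauss(b.mean(), b.std(), a.mean(), a.std()))
    return w * dtw(a, b) + (1 - w) * kl

x = np.array([120.0, 122.0, 118.0, 130.0])        # e.g. blood-pressure samples
y = np.array([119.0, 121.0, 125.0, 128.0, 131.0])
print(combined_distance(x, y))
```
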
Item Uma abordagem evolucionária para o teste de instruções select SQL com o uso da análise de mutantes (Universidade Federal de Goiás, 2013-08-02) Monção, Ana Claudia Bastos Loureiro; Rodrigues, Cássio Leonardo; Camilo Júnior, Celso Gonçalves; http://lattes.cnpq.br/6776569904919279; Camilo Júnior, Celso Gonçalves; Leitão Júnior, Plínio de Sá; Rodrigues, Cássio Leonardo; Souza, Jerffeson Teixeira de
Software Testing is an important area of Software Engineering for ensuring software quality. It consists of activities that demand considerable time and cost, but that need to be carried out throughout the software construction process. As in other areas of Software Engineering, there are problems in Software Testing activities whose solution is not trivial. For these problems, several optimization and search techniques have been explored in an attempt to find optimal or near-optimal solutions, giving rise to the research lines of Search-Based Software Engineering (SBSE) and Search-Based Software Testing (SBST). This work is part of this context and aims to solve the problem of selecting test data for the execution of tests on SQL statements. Given the number of potential solutions to this problem, the proposed approach combines Mutation Analysis for SQL with Evolutionary Computation to find a reduced data set able to detect a large number of defects in the SQL statements of a particular application. From a heuristic perspective, the proposal uses Genetic Algorithms (GA) to select tuples from an existing database (taken from the production environment), trying to reduce it to a relevant and effective data set. During the evolutionary process, Mutation Analysis is used to evaluate each set of test data selected by the GA. The results obtained from the experiments showed good performance of the Genetic Algorithm metaheuristic and its variations.

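A toy sketch of the kind of search involved: a genetic algorithm evolves subsets of database tuples, and fitness is the number of SQL mutants a subset kills. The tuple-versus-mutant kill matrix below is random placeholder data; in practice it would come from executing each mutated SELECT statement against the candidate tuples.

```python
# Hypothetical sketch: GA selecting a small set of tuples that kills many SQL mutants.
# The tuple-vs-mutant "kill" matrix is random placeholder data, for illustration only.
import random

N_TUPLES, N_MUTANTS, SUBSET = 200, 50, 10
random.seed(0)
kills = [[random.random() < 0.05 for _ in range(N_MUTANTS)] for _ in range(N_TUPLES)]

def fitness(individual):
    """Number of distinct mutants killed by the selected tuples."""
    return len({m for t in individual for m in range(N_MUTANTS) if kills[t][m]})

def crossover(a, b):
    """Keep half of parent a, fill the rest from parent b, without duplicates."""
    return list(dict.fromkeys(a[: SUBSET // 2] + b))[:SUBSET]

def mutate(ind):
    """Replace one selected tuple by a random one."""
    ind = ind[:]
    ind[random.randrange(SUBSET)] = random.randrange(N_TUPLES)
    return ind

pop = [random.sample(range(N_TUPLES), SUBSET) for _ in range(30)]
for _ in range(100):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]
    pop = parents + [mutate(crossover(*random.sample(parents, 2))) for _ in range(20)]

print("mutants killed by best subset:", fitness(max(pop, key=fitness)))
```
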
Item Uma abordagem ontológica baseada em informações de contexto para representação de conhecimento de monitoramento de sinais vitais humanos (Universidade Federal de Goiás, 2013-10-21) Bastos, Alexsandro Beserra; Sene Junior, Iwens Gervasio; http://lattes.cnpq.br/3693296350551971; Neto, Renato de Freitas Bulcão; http://lattes.cnpq.br/5627556088346425; Carvalho, Sérgio Teixeira de; Vanni, Renata Maria Porto
Monitoring vital signs in intensive care units (ICUs) is an everyday activity of various health professionals, including doctors, nurses, technicians and nursing assistants. In most ICUs, vital signs are monitored and recorded manually and at predefined time instants. Records of vital sign measurements in ICUs are generally written on preprinted forms, and a health professional has to go back through those forms to obtain information about a patient's clinical state. Moreover, when an abnormal vital sign measurement is detected, a multiparameter monitor triggers audible alarms, which may not be promptly noticed by the medical staff, depending on the workflow within the ICU. In this sense, this work proposes a knowledge representation model for the monitoring of vital signs of ICU patients. The proposed model exploits the expressiveness and formality of ontologies, rules and Semantic Web technologies, which promotes the consensual comprehension, sharing and reuse of patients' vital signs. The aim is to enable context-aware applications for monitoring human vital signs, including storage, query support and semantic alarm triggering.

Item Aceleração de uma variação do problema k-nearest neighbors (Universidade Federal de Goiás, 2014-01-29) Morais Neto, Jorge Peixoto de; Longo, Humberto José; Foulds, Leslie Richard; Martins, Wellington Santos; http://lattes.cnpq.br/3041686206689904; Longo, Humberto José; Rodrigues, Rosiane de Freitas; Silva, EdCarlos Domingos da
Let M be a metric space and let P be a subset of M. The well-known k-nearest neighbors problem (KNN) consists in finding, given q ∈ M, the k elements of P that are closest to q according to the metric of M. We discuss a variation of KNN for a particular class of pseudo-metric spaces, described as follows. Let m ∈ ℕ be a natural number and let d be the Euclidean distance in ℝ^m. Given p ∈ ℝ^m, p := (p_1, ..., p_m), let C(p) be the set of the m rotations of p's coordinates, C(p) := {(p_1, ..., p_m), (p_2, ..., p_m, p_1), ..., (p_m, p_1, ..., p_{m-1})}, and define the special distance d_e as d_e(p, q) := min_{p′ ∈ C(p)} d(p′, q). Then d_e is a pseudo-metric and (ℝ^m, d_e) is a pseudo-metric space; the class of pseudo-metric spaces under discussion is {(ℝ^m, d_e) | m ∈ ℕ}. The brute-force approach is too costly for instances of practical size. We present a more efficient solution employing parallelism, the FFT (fast Fourier transform) and the fast elimination of unfavorable training vectors. We describe a program named CyclicKNN which implements this solution, and report its speedup over serial brute-force search on reference datasets.

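To illustrate where the FFT enters, the sketch below computes d_e(p, q) by obtaining the inner products of q with all m rotations of p at once through a circular cross-correlation; this is a small NumPy illustration of the underlying identity, not the CyclicKNN program itself.

```python
# Sketch: d_e(p, q) = min over rotations p' of ||p' - q||.
# Since ||p' - q||^2 = ||p||^2 + ||q||^2 - 2*<p', q>, and the inner products of q
# with all m cyclic rotations of p form a circular cross-correlation, one FFT pair
# evaluates every rotation in O(m log m) instead of O(m^2).
import numpy as np

def cyclic_distance(p, q):
    corr = np.fft.ifft(np.fft.fft(p) * np.conj(np.fft.fft(q))).real  # <rot_s(p), q> for all s
    sq = np.dot(p, p) + np.dot(q, q) - 2.0 * corr.max()
    return np.sqrt(max(sq, 0.0))          # guard against tiny negative rounding error

def brute_force(p, q):
    return min(np.linalg.norm(np.roll(p, -s) - q) for s in range(len(p)))

rng = np.random.default_rng(0)
p, q = rng.normal(size=16), rng.normal(size=16)
print(cyclic_distance(p, q), brute_force(p, q))   # the two values should agree
```
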
Item Acelerando a construção de tabelas hash para dados textuais com aplicações (Universidade Federal de Goiás, 2020-11-17) Barros, Chayner Cordeiro; Martins, Wellington Santos; http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4782112U1; Martins, Wellington Santos; Rosa, Thierson Couto; Sousa, Daniel Xavier de
Text mining is characterized by the extraction of information from textual data, in the most diverse formats, aiming at knowledge production, classification, clustering and translation of this information, among other goals. For text mining to be efficient, procedures are performed on the data to ensure that it contains only content relevant to the analysis to be performed and that it is structured in a format that is easier to manipulate computationally. Several pre-processing tasks must be carried out on this data in order to achieve the desired quality and representation. In this sense, the present work proposes an implementation of a hash table capable of efficiently exploiting the high degree of parallelism available in GPUs as a way to increase the performance of pre-processing tasks. This work not only presents more efficient algorithms, but also demonstrates the feasibility of their use in applications such as the generation of co-occurrence matrices and the representation of text using embeddings.

Item Adaptação transcultural, desenvolvimento e validação psicométrica da versão em Libras da Escala de Usabilidade do Sistema (Universidade Federal de Goiás, 2021-10-07) Oliveira, Luíla Moraes de; Chaveiro, Neuma; http://lattes.cnpq.br/1345257253831999; Rodrigues, Cássio Leonardo; http://lattes.cnpq.br/2590620617848677; Rodrigues, Cássio Leonardo; Figueiredo, Jorge César Abrantes de; Machado, Maria Istela Cagnin
Usability is related to the subjective concept of ease of use and varies according to the user. The participation of deaf people in usability tests is still low, and when it occurs, questionnaires in oral language are used, which are not easily accessible to this community. It is necessary to interact with the deaf community on its own terms, respecting its identity and culture, expressed in particular through the use of sign language. Libras is the official sign language in Brazil. This work carried out the cross-cultural adaptation of the usability assessment questionnaire SUS (System Usability Scale) to Libras. The translation was performed by a multidisciplinary team that included experts in Libras and Software Engineering, and its quality was assessed through the back-translation method. However, a recorded version of the SUS in Libras is not enough for wide application of the instrument; thus, a test system accessible in Libras and Portuguese was developed, which obtained an overall usability score of 68.70, considered high. This system was remotely tested by 76 volunteer participants, 55 deaf and 21 hearing. Of these, a mixed sample of 23 individuals performed tasks and answered the SUS-Português brasileiro, and a sample of 53 deaf signers fluent in Libras performed the same tasks and answered the SUS-Libras. Both instruments were evaluated for the psychometric properties of internal consistency and criterion validity. The SUS-Português brasileiro obtained a Cronbach's alpha coefficient of 0.93 and a Spearman correlation of rs = 0.590 (n = 23, p < 0.001). The SUS-Libras had a Cronbach's alpha coefficient of 0.72 and a Spearman correlation of rs = 0.298 (n = 53, p < 0.05); both instruments demonstrated reliability and validity.

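For reference, the two statistics reported above can be computed as in the sketch below; the 5-participant response matrix and the external criterion scores are invented placeholders, while the real analysis used the 23- and 53-participant samples described in the abstract.

```python
# Sketch: Cronbach's alpha (internal consistency) and Spearman correlation
# (criterion validity) for a questionnaire. The response matrix is placeholder data.
import numpy as np
from scipy.stats import spearmanr

responses = np.array([          # rows = participants, columns = questionnaire items
    [4, 5, 4, 3, 5, 4, 4, 5, 3, 4],
    [2, 3, 2, 2, 3, 3, 2, 2, 3, 2],
    [5, 4, 5, 5, 4, 5, 5, 4, 5, 5],
    [3, 3, 4, 3, 3, 4, 3, 3, 4, 3],
    [4, 4, 5, 4, 5, 4, 4, 5, 4, 4],
])

def cronbach_alpha(x):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = x.shape[1]
    item_var = x.var(axis=0, ddof=1).sum()
    total_var = x.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var / total_var)

criterion = np.array([80, 45, 97, 60, 85])    # e.g. an external usability measure
print(cronbach_alpha(responses))
print(spearmanr(responses.sum(axis=1), criterion))
```
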
Item Agente para suporte à decisão multicritério em gestão pública participativa (Universidade Federal de Goiás, 2014-09-26) Amorim, Leonardo Afonso; Patto, Vinicius Sebba; Bulcão Neto, Renato de Freitas; http://lattes.cnpq.br/5627556088346425; Bulcão Neto, Renato de Freitas; Sene Junior, Iwens Gervásio; Patto, Vinicius Sebba; Cruz Junior, Gelson da
Decision making in public management is associated with a high degree of complexity due to insufficient financial resources to meet all the demands emanating from the various sectors of society. Often, economic activities are in conflict with social or environmental causes. Another important aspect of decision making in public management is the inclusion of various stakeholders, e.g. public management experts, small business owners, shopkeepers, teachers, representatives of social and professional groups, citizens, etc. The goal of this master's thesis is to present two computational agents to aid decision making in public management as part of the ADGEPA project: the Miner Agent (MA) and the Decision Support Agent (DSA). The MA uses data mining techniques and the DSA uses multi-criteria analysis to point out relevant issues. ADGEPA (Digital Assistant for Participatory Public Management) is an innovative practice to support participatory decision making in the management of public resources. The main contribution of this master's thesis is the ability to assist in the discovery of patterns and correlations between environmental aspects that are not obvious and can vary from community to community. This contribution would help the public manager make systemic decisions that, in addition to attacking the main problem of a given region, would reduce or solve other problems. Validation of the results depends on actual data and on analysis by public managers; in this work, the data were simulated.

Item Airetama Um Arcabouço Baseado em Sistemas Multiagentes para a Implantação de Comunidades Virtuais de Prática na Web (Universidade Federal de Goiás, 2010-10-04) ALARCÓN, Jair Abú Bechir Láscar; CARVALHO, Cedric Luiz de; http://lattes.cnpq.br/4090131106212286
The objective of this dissertation is to present the Airetama framework, which is based on Multiagent Systems and Semantic Web principles. It provides a semantic, distributed and open-source infrastructure for the creation of Virtual Communities of Practice on the Web, and makes it possible, through the use of agents, to couple resources and tools that use semantic technologies. The main objective of integrating semantics into the current Web is to allow such software agents to use its pages more intelligently, thus offering better services.

Item Algoritmo evolutivo com representação inteira para seleção de características (Universidade Federal de Goiás, 2017-04-20) Sousa, Rhelcris Salvino de; Soares, Anderson da Silva; http://lattes.cnpq.br/1096941114079527; Soares, Telma Woerle de Lima; http://lattes.cnpq.br/6296363436468330; Soares, Telma Woerle de Lima; http://lattes.cnpq.br/6296363436468330; Soares, Anderson da Silva; http://lattes.cnpq.br/1096941114079527; Camilo Junior, Celso Gonçalves; Dias, Jailson Cardoso
Machine learning problems usually involve a large number of features or variables. In this context, feature selection algorithms face the challenge of determining a reduced subset of the original set. The main difficulty of this task is the large number of solutions in the search space. The genetic algorithm is one of the most widely used techniques for this type of problem due to its implicit parallelism in exploring the search space of the problem considered; however, a binary representation is usually used to encode the solutions. This work proposes a solution, called intEA-MLR, that uses an integer representation instead of a binary one. The integer representation makes the data easier to interpret, since the features to be selected are represented by integer values, reducing the size of the chromosome used in the search process. In this context, intEA-MLR is presented as an alternative way of solving high-dimensional regression problems. As a case study, three different data sets are used, concerning the determination of properties of interest in samples of 1) wheat grain, 2) medicine tablets and 3) petroleum. These sets were used in competitions held at the International Diffuse Reflectance Conference (IDRC) (http://cnirs.clubexpress.com/content.aspx?page_id=22&club_id=409746&module_id=190211) in the years 2008, 2012 and 2014, respectively. The results showed that the proposed solution improved on the classical implementation that uses binary coding, with both more accurate prediction models and a reduced number of features. intEA-MLR also outperformed the competition winners, with results up to 91.17% better than the winner for the petroleum data set. In addition, the results indicated that the computation time required by intEA-MLR becomes comparatively smaller as more features are available.

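The sketch below contrasts the two encodings and shows a fitness evaluation based on multiple linear regression (MLR); the synthetic data, the chromosome length and the error computed on the training samples are simplifying assumptions, not the actual intEA-MLR configuration.

```python
# Sketch: integer-encoded chromosome for variable selection with an MLR fitness.
# A binary chromosome needs one gene per variable, while an integer chromosome only
# stores the indices of the selected variables, so it stays short when few variables
# are wanted. The "spectra" below are synthetic placeholder data.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_vars, n_selected = 100, 500, 10
X = rng.normal(size=(n_samples, n_vars))
y = X[:, :5].sum(axis=1) + 0.1 * rng.normal(size=n_samples)   # depends on 5 variables

def rmsep(selected, X, y):
    """Root mean square error of an MLR model restricted to the chosen columns
    (computed on the same samples here only to keep the sketch short)."""
    A = np.column_stack([np.ones(len(y)), X[:, selected]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sqrt(np.mean((A @ coef - y) ** 2))

binary = rng.random(n_vars) < 0.02                             # binary encoding: 500 genes
integer = rng.choice(n_vars, size=n_selected, replace=False)   # integer encoding: 10 genes

print(rmsep(np.flatnonzero(binary), X, y))
print(rmsep(integer, X, y))
```
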
Item Algoritmo evolutivo de cromossomo duplo para calibração multivariada (Universidade Federal de Goiás, 2013-03-05) Santiago, Kelton de Sousa; Soares, Anderson da Silva; http://lattes.cnpq.br/1096941114079527; Soares, Anderson da Silva; Soares, Telma Woerle de Lima; Coelho, Clarimar José
This work proposes a dual-chromosome evolutionary algorithm (AGCD) that performs sample and variable selection simultaneously. The combination of algorithmic methods for selecting samples and variables in multivariate calibration aims at building an effective model for predicting the concentration of a property of interest. As a case study, data acquired by near-infrared (NIR) analysis of wheat samples is used to estimate protein concentration. The sample selection algorithms random selection (RNG), Kennard-Stone (KS) and sample set partitioning based on joint X and Y (SPXY) were used in conjunction with the successive projections algorithm (SPA) and the partial least squares (PLS) algorithm for variable selection, in order to provide a basis for comparison with the results obtained by the proposed AGCD algorithm. The results of the sample selection algorithms (RNG, KS and SPXY) were very close to one another, but when they were used together with the variable selection algorithms (SPA and PLS) the results were better in terms of RMSEP. The AGCD achieved significantly better results than the other tested algorithms, reaching an improvement of 97% over the KS algorithm and of 63% over the SPXY-PLS algorithm, which came closest to the AGCD results.

Item Algoritmo evolutivo multi-objetivo de tabelas para seleção de variáveis em calibração multivariada (Universidade Federal de Goiás, 2014-04-08) Jorge, Carlos Antônio Campos; Soares, Anderson da Silva; http://lattes.cnpq.br/1096941114079527; Soares, Anderson da Silva; Coelho, Clarimar José; Delbem, Alexandre Cláudio Botazzo
This work proposes the use of a multi-objective evolutionary algorithm that relies on subsets stored in a data structure called a table, in which the best individuals for each objective considered are preserved. This approach is compared with the traditional mono-objective evolutionary algorithm (GA), classical algorithms (PLS and SPA) and another classic multi-objective algorithm (NSGA-II). As a case study, a multivariate calibration problem is presented which involves the prediction of protein concentration in whole wheat samples from spectrophotometric measurements. The results showed that the proposed formulation has a smaller prediction error than the mono-objective formulation, with a smaller number of variables. Finally, a study of the noise sensitivity of the multi-objective formulation showed a better result when compared to the other classical algorithms for variable selection.

Item Algoritmo evolutivo multi-objetivo em tabelas para seleção de variáveis em classificação multivariada (Universidade Federal de Goiás, 2014-10-29) Ribeiro, Lucas de Almeida; Soares, Anderson da Silva; http://lattes.cnpq.br/1096941114079527; Soares, Anderson da Silva; Coelho, Clarimar José; Federson, Fernando Marques
This work proposes the use of a multi-objective evolutionary algorithm on tables (AEMT) for variable selection in classification problems, using linear discriminant analysis. The proposed algorithm aims to find minimal subsets of the original variables that yield robust classifiers, without significant loss of classification ability. The classifiers modeled from the solutions found by this algorithm are compared to those found by mono-objective formulations (such as PLS, SPA and our own implementation of a simple genetic algorithm) and by multi-objective formulations (such as the multi-objective simple genetic algorithm, MULTI-GA, and the NSGA-II). As a case study, the algorithm was applied to the selection of spectral variables for the classification, by linear discriminant analysis (LDA), of biodiesel/diesel samples. The results showed that the evolutionary formulations find solutions with a smaller number of variables (on average) and a better average error rate when compared to PLS and SPA. The proposed AEMT formulation, with the fitness functions mean classification risk, number of selected variables and number of correlated variables in the model, found solutions with lower average errors than those found by the NSGA-II and the MULTI-GA, and also a smaller number of variables than the MULTI-GA. Regarding noise sensitivity, the solution found by the AEMT was less sensitive than the other formulations compared, showing that the AEMT yields more robust classifiers. Finally, class separation regions are shown, based on the dispersion of the samples as a function of the variables selected in one of the AEMT solutions, indicating that separation regions can be determined from the selected variables.

Item Algoritmo genético compacto com dominância para seleção de variáveis (Universidade Federal de Goiás, 2017-04-20) Nogueira, Heber Valdo; Soares, Telma Woerle de Lima; http://lattes.cnpq.br/6296363436468330; Soares, Anderson da Silva; http://lattes.cnpq.br/1096941114079527; Soares, Anderson da Silva; Soares, Telma Woerle de Lima; Coelho, Clarimar José; Dias, Jailson Cardoso
The feature selection problem consists in selecting a subset of attributes able to reduce computational processing and storage requirements, decrease the effects of the curse of dimensionality and improve the performance of predictive models. Among the strategies used to solve this type of problem, evolutionary algorithms, such as the Genetic Algorithm, stand out. Despite the relative success of the Genetic Algorithm in solving various types of problems, different improvements have been proposed to enhance its performance. Such improvements focus mainly on population representation, search mechanisms, and evaluation methods. One of these proposals gave rise to the Compact Genetic Algorithm (CGA), which introduces new ways of representing the population and guiding the search for better solutions. Applying this type of strategy to the variable selection problem often involves overfitting. In this context, this work proposes a version of the Compact Genetic Algorithm that minimizes more than one objective simultaneously. Such an algorithm makes use of the concept of Pareto dominance and is therefore called the Compact Genetic Algorithm with Dominance (CGAD). As a case study, to evaluate the performance of the proposed algorithm, CGAD is combined with Multiple Linear Regression (MLR) to select variables that better predict protein concentration in wheat samples. The proposed algorithm is compared to the CGA and to the Mutation-based Compact Genetic Algorithm. The results indicate that CGAD is able to select a small set of variables, reducing the prediction error of the calibration model and reducing the possibility of overfitting.

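A compact GA keeps a probability vector instead of a full population; the sketch below shows that mechanism combined with a Pareto-dominance test over two objectives (prediction error and number of selected variables), in the spirit of the approach described above. The synthetic data, update rule details and parameter values are assumptions, not the CGAD as specified in the dissertation.

```python
# Sketch: compact GA with a Pareto-dominance comparison over two objectives.
# A probability vector p[i] stores the chance of selecting variable i; two candidate
# solutions are sampled, and the dominance winner pulls p toward its own bits.
import numpy as np

rng = np.random.default_rng(0)
n_vars, pop_size, steps = 50, 100, 2000
X = rng.normal(size=(200, n_vars))
y = X[:, :5].sum(axis=1) + 0.1 * rng.normal(size=200)   # placeholder calibration data

def objectives(mask):
    """(prediction error, number of selected variables) - both to be minimized."""
    if not mask.any():
        return np.inf, n_vars
    A = np.column_stack([np.ones(len(y)), X[:, mask]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sqrt(np.mean((A @ coef - y) ** 2)), int(mask.sum())

def dominates(a, b):
    """True if a is no worse in every objective and strictly better in at least one."""
    return all(ai <= bi for ai, bi in zip(a, b)) and any(ai < bi for ai, bi in zip(a, b))

p = np.full(n_vars, 0.5)                                  # probability vector
for _ in range(steps):
    a, b = rng.random(n_vars) < p, rng.random(n_vars) < p # sample two candidates
    if dominates(objectives(b), objectives(a)):
        a, b = b, a                                       # 'a' is now the winner
    p += np.where(a != b, np.where(a, 1, -1) / pop_size, 0)
    p = p.clip(1 / n_vars, 1 - 1 / n_vars)

best = p > 0.5
print("selected variables:", np.flatnonzero(best), objectives(best))
```
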
Item Algoritmo paralelo para processamento de séries temporais de sensoriamento remoto com aplicação na classificação do uso e cobertura do solo (Universidade Federal de Goiás, 2021-03-31) Paiva, Roberto de Urzêda; Oliveira, Sávio Salvarino Teles de; http://lattes.cnpq.br/1905829499839846; Martins, Wellington Santos; http://lattes.cnpq.br/3041686206689904; Laureano, Gustavo Teodoro; Martins, Wellington Santos; Ferreira Júnior, Laerte Guimarães; Oliveira, Sávio Salvarino Teles de
The increase in satellite launches into Earth's orbit in recent years has generated a huge amount of remote sensing data. These data are used in automated classification approaches, generating land-use and land-cover products for different landscapes around the world. Dynamic Time Warping (DTW) is a well-known computational method used to measure the similarity between time series, and it has been explored in several algorithms for remote sensing time series analysis. These DTW-based algorithms can generate similarity measures between the series and pre-established patterns, and these measures can be used as meta-features to increase the performance of classification models. However, DTW-based algorithms require considerable computational resources and have a high execution time, which makes them difficult to apply to large volumes of data. To address this limitation, this work presents a parallel and fully scalable solution to optimize the construction of meta-features from remote sensing time series. In addition, results of applying the generated meta-features to the training and evaluation of classification models based on Random Forest are presented, allowing the impact of the meta-features on automated land-use and land-cover classification to be assessed. The results show that both approaches led to improvements in execution time and accuracy when compared to traditional strategies and models.

Item Algoritmos baseados em estratégia evolutiva para a seleção dinâmica de espectro em rádios cognitivos (Universidade Federal de Goiás, 2013-11-22) Barbosa, Camila Soares; Cardoso, Kleber Vieira; http://lattes.cnpq.br/0268732896111424; Cardoso, Kleber Vieira; Corrêa, Sand Luz; Camilo Junior, Celso Gonçalves; Santos, Aldri Luiz dos
One of the main challenges in Dynamic Spectrum Selection for Cognitive Radios is the choice of the frequency range for each transmission. This choice should minimize interference with legacy devices and maximize the discovery of opportunities, or white spaces. There are several solutions to this issue, and Reinforcement Learning algorithms are the most successful. Among them, Q-Learning stands out; its weak point is parameterization, since adjustments are needed in order to successfully reach the proposed objective. In this sense, this work proposes an algorithm based on evolution strategy whose main characteristics are adaptability to the environment and fewer parameters. Through simulation, the performance of Q-Learning and of the proposed approach were compared in different scenarios, and the results allowed the spectral efficiency and the adaptability to the environment to be evaluated. The proposed approach shows promising results in most scenarios.

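For context, the Q-Learning baseline mentioned above can be sketched as a stateless, bandit-style update over channels; the occupancy probabilities, reward model, learning rate and exploration rate are invented for illustration, and such parameters are precisely the kind of hand-tuned configuration the proposed evolution-strategy approach tries to avoid.

```python
# Sketch of a Q-Learning baseline for dynamic spectrum selection: one Q-value per
# channel, epsilon-greedy choice, reward 1 for a successful (interference-free)
# transmission. Channel occupancy probabilities are placeholder values.
import random

random.seed(0)
channels = [0.9, 0.4, 0.2, 0.7]       # prob. that a primary user occupies each channel
alpha, epsilon = 0.1, 0.1             # learning rate and exploration rate (to be tuned)
Q = [0.0] * len(channels)

for _ in range(5000):
    if random.random() < epsilon:                      # explore a random channel
        c = random.randrange(len(channels))
    else:                                              # exploit the best channel so far
        c = max(range(len(channels)), key=Q.__getitem__)
    reward = 0.0 if random.random() < channels[c] else 1.0   # free channel -> success
    Q[c] += alpha * (reward - Q[c])                    # stateless Q-learning update

print("preferred channel:", max(range(len(channels)), key=Q.__getitem__))  # expect 2
```
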
Item Algoritmos de aprendizado de máquina na predição e avaliação de evasão de clientes em ambiente de produção (Universidade Federal de Goiás, 2021-07-02) Oliveira, Breno; Soares, Anderson da Silva; http://lattes.cnpq.br/1096941114079527; Soares, Anderson da Silva; Soares, Telma Woerle de Lima; Sousa, Rafael Teixeira
The development of machine learning solutions involves several well-established stages; however, scientific studies concentrate on stages such as data engineering, model training, and performance evaluation metrics. The deployment of machine learning solutions in business environments at an unprecedented level invites revisiting some problems previously mentioned in the literature but little explored, among them monitoring and evaluating the deterioration of the solution over time. During the training of machine learning models, it is assumed that the data not seen by the model in production follows the same distribution as the data used during the training stage. However, production models can lose performance as the data changes over time, a phenomenon known in the literature as concept drift. In this context, this work proposes a methodology that uses automated machine learning (AutoML) with data stream learning, capable of mitigating concept drift that may arise in models deployed in a production environment. Real data from a customer churn problem of a large-circulation regional newspaper were used. Three machine learning models were implemented using two methodologies: the proposed methodology, called AutoML-DS, and a reference methodology that relies on conventional model retraining. The results showed that the reference methodology suffers performance losses in the implemented models, while AutoML-DS preserves its predictive capacity. AutoML-DS was able to adapt the models over time, without performing complete retraining, keeping variations in the error rate small.

Item Algoritmos de junção por similaridade sobre fluxo de dados (Universidade Federal de Goiás, 2020-07-21) Pacífico, Lucas Oliveira; Ribeiro, Leonardo Andrade; http://lattes.cnpq.br/4036932351063584; Ribeiro, Leonardo Andrade; Dorneles, Carina Friedrich; Leitão Junior, Plinio de Sa
In today's Big Data era, data is generated and collected at high speed, which imposes strict performance and memory requirements for processing this data. In addition, the presence of heterogeneous data demands the use of similarity operations, which are computationally more expensive. In this context, the present work investigates the problem of performing similarity joins over a continuous stream of data represented by sets. The concept of temporal similarity is employed, where the similarity between two data items decreases with the distance between their arrival times. The proposed algorithms directly incorporate this concept to reduce the comparison space and memory consumption. Moreover, a new technique based on the partial frequency of the data elements is presented to substantially reduce processing cost. The results of the experimental evaluation demonstrate that the proposed techniques provide substantial performance gains and good memory usage.

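A small sketch of what temporal similarity between set-represented stream items can look like: plain Jaccard similarity damped by the gap between arrival times. The exponential decay, half-life and threshold are assumptions for illustration; the dissertation's algorithms add filtering and indexing on top of this basic idea.

```python
# Sketch: temporal set similarity for a streaming similarity join. Plain Jaccard
# similarity is weighted down as the arrival-time gap grows, so old stream items
# naturally stop matching and can eventually be pruned from memory.
import math

def temporal_similarity(r, s, t_r, t_s, half_life=60.0):
    jaccard = len(r & s) / len(r | s)
    decay = math.exp(-math.log(2) * abs(t_r - t_s) / half_life)  # halves every 60 s
    return jaccard * decay

stream = [                      # (arrival time in seconds, set of tokens)
    (0.0,   {"cheap", "phone", "case", "red"}),
    (30.0,  {"cheap", "phone", "case", "blue"}),
    (500.0, {"cheap", "phone", "case", "red"}),
]

threshold = 0.4
for i in range(len(stream)):
    for j in range(i + 1, len(stream)):
        (ti, r), (tj, s) = stream[i], stream[j]
        sim = temporal_similarity(r, s, ti, tj)
        if sim >= threshold:            # only the two items that arrive close together match
            print(f"pair ({i}, {j}) matches with similarity {sim:.2f}")
```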