INF - Instituto de Informática
Browsing INF - Instituto de Informática by Academic Unit "Instituto de Informática - INF (RG)"
Now showing 1 - 5 of 5
Item Preditor híbrido de estruturas terciárias de proteínas [Hybrid predictor of protein tertiary structures] (Universidade Federal de Goiás, 2023-08-10) Almeida, Alexandre Barbosa de; Soares, Telma Woerle de Lima; http://lattes.cnpq.br/6296363436468330; Soares, Telma Woerle de Lima; Camilo Junior, Celso Gonçalves; Vieira, Flávio Henrique Teles; Delbem, Alexandre Cláudio Botazzo; Faccioli, Rodrigo Antônio
Proteins are organic molecules composed of chains of amino acids and perform a variety of essential biological functions in the body. The native structure of a protein is the result of the folding process of its amino acids, with their spatial orientation primarily determined by two dihedral angles (φ, ψ). This work proposes a new hybrid method for predicting the tertiary structures of proteins, called hyPROT, which combines Multi-objective Evolutionary Algorithm (MOEA) optimization, Molecular Dynamics, and Recurrent Neural Networks (RNNs). The proposed approach investigates the evolutionary profile of the dihedral angles (φ, ψ) obtained by different MOEAs during dominance-based minimization of the objective function and energy minimization by molecular dynamics. This proposal is unprecedented in the protein prediction literature. The premise under investigation is that the evolutionary profile of the dihedrals may conceal relevant patterns about folding mechanisms. To analyze the evolutionary profile of the angles (φ, ψ), RNNs were used to abstract and generalize the specific biases of each MOEA. The selected MOEAs were NSGA-II, BRKGA, and GDE3, and the objective function investigated combines the potential energy from non-covalent interactions and the solvation energy. The results show that hyPROT reduced the RMSD value of the best prediction generated by the individual MOEAs by at least 33%.
Predicting new series for the dihedral angles made it possible to build histograms, which indicate a possible statistical ensemble responsible for the distribution of the dihedrals (φ, ψ) during the folding process.
Item Alocação de recursos e posicionamento de funções virtualizadas em redes de acesso por rádio desagregadas [Resource allocation and placement of virtualized functions in disaggregated radio access networks] (Universidade Federal de Goiás, 2023-08-30) Almeida, Gabriel Matheus Faria de; Pinto, Leizer de Lima; http://lattes.cnpq.br/0611031507120144; Cardoso, Kleber Vieira; http://lattes.cnpq.br/0268732896111424; Cardoso, Kleber Vieira; Pinto, Leizer de Lima; Klautau Júnior, Aldebaro Barreto da Rocha; Silva, Luiz Antonio Pereira da
Jointly choosing a functional split of the protocol stack and the placement of network functions in a virtualized RAN is critical to using access network resources efficiently. This problem is a current research topic in 5G and post-5G networks, and it involves the challenge of simultaneously choosing the placement of virtualized functions, the routes for traffic, and the management of available computing resources. In this work, we present three approaches to solve this problem in the planning scenario and two approaches in the network operation scenario. The first approach is a Mixed Integer Linear Programming (MILP) model that considers a generic set of processing nodes and multipath routing. The second approach uses artificial intelligence and machine learning concepts, in which we formulate a deep reinforcement learning agent. The third approach is based on search meta-heuristics, using a genetic algorithm. The last two approaches are Markov Decision Process (MDP) formulations that consider dynamic demand on radio units. In all formulations, the objective is to maximize the centralization of network functions while minimizing positioning cost. Analysis of the solutions and comparison of their results show that exact approaches such as MILP naturally provide the best solution.
However, in terms of efficiency, the genetic algorithm has the best search time, finding a high-quality solution in a few seconds. The deep reinforcement learning agent shows good convergence, finding high-quality solutions to the problem and generalizing across different topologies. Finally, the formulations for the network operation scenario with dynamic demand are highly complex due to the size of the action space.
Item Uma estratégia de pós-processamento para seleção de regras de associação para descoberta de conhecimento [A post-processing strategy for selecting association rules for knowledge discovery] (Universidade Federal de Goiás, 2023-08-22) Cintra, Luiz Fernando da Cunha; Salvini, Rogerio Lopes; http://lattes.cnpq.br/5009392667450875; Salvini, Rogerio Lopes; Rosa, Thierson Couto; Aguilar Alonso, Eduardo José
Association rule mining (ARM) is a traditional data mining method that provides information about associations between items in transactional databases. A known problem of ARM is the large number of rules generated, which requires approaches to post-process these rules so that a human expert is able to analyze the associations found. In some contexts the domain expert is interested in investigating only one item of interest; in these cases, a search guided by the item of interest can help to mitigate the problem. For an exploratory analysis, this implies looking for associations in which the item of interest appears in any part of the rule. Few methods focus on post-processing the generated rules targeting an item of interest. The present work seeks to highlight the relevant associations of a given item in order to bring knowledge about its role through its interactions and relationships in common with the other items. To this end, this work proposes a post-processing strategy for association rules that selects and groups rules oriented to a certain item of interest provided by an expert in a domain of knowledge.
In addition, a graphical form is also presented so that the associations between rules and groupings of rules found are more easily visualized and interpreted. Four case studies show that the proposed method is admissible and manages to reduce the number of relevant rules to a manageable amount, allowing analysis by domain experts. Graphs showing the relationships between the groups were generated in all case studies and facilitate their analysis.
Item Junções por similaridade aproximadas em espaços vetoriais densos [Approximate similarity joins in dense vector spaces] (Universidade Federal de Goiás, 2023-08-24) Santana, Douglas Rolins de; Santana; Ribeiro, Leonardo Andrade; http://lattes.cnpq.br/4036932351063584; Ribeiro, Leonardo Andrade; Bedo, Marcos Vinicius Naves; Martins, Wellington Santos
Similarity join is an operation that returns pairs of objects whose similarity is greater than or equal to a specified threshold, and it is essential for tasks such as data cleaning, mining, and integration. A common approach is to use vector representations of the data, such as the TF-IDF method, and measure the similarity between vectors using the cosine function. However, computing the similarity for all pairs of vectors can be computationally prohibitive on large data sets. Traditional algorithms exploit the sparsity of vectors and apply filters to reduce the comparison space. Recently, advances in natural language processing have produced semantically richer vectors, improving result quality. However, these vectors have different characteristics from those generated by traditional methods, being dense and of high dimensionality. Preliminary experiments demonstrated that L2AP, the best-known algorithm for similarity join, is not efficient for dense vector spaces. Due to the intrinsic characteristics of such vectors, approximate solutions based on specialized indices are predominant for dealing with large datasets.
In this context, we investigate how to perform similarity joins using Hierarchical Navigable Small World (HNSW), a state-of-the-art graph-based index designed for approximate k-nearest neighbor (kNN) queries. We explored the design space of possible solutions, ranging from alternatives layered on top of HNSW to deeper integration of similarity join processing into this framework. The experiments demonstrated speedups of up to 2.48 and 3.47 orders of magnitude relative to the exact method and the baseline approach, respectively, while maintaining recall rates close to 100%.
Item Classificação das despesas com pessoal no contexto dos Tribunais de Contas [Classification of personnel expenses in the context of the Courts of Accounts] (Universidade Federal de Goiás, 2023-08-22) Teixeira, Pedro Henrique; Silva, Nadia Félix Felipe da; http://lattes.cnpq.br/7864834001694765; Salvini, Rogerio Lopes; http://lattes.cnpq.br/5009392667450875; Salvini, Rogerio Lopes; Silva, Nadia Félix Felipe da; Fernandes, Deborah Silva Alves; Costa, Nattane Luíza da
The Court of Accounts of the Municipalities of the State of Goiás (TCMGO) uses the expenditure data received monthly from the municipalities of Goiás to check personnel expenses, as determined by the Fiscal Responsibility Law (LRF). However, there are indications that the classification of expenses sent by the municipal manager may contain inconsistencies arising from fiscal tricks, creative accounting, or material errors, leading TCMGO to make decisions based on incorrect reports, with serious consequences for the inspection process. To deal with this problem, this work used text classification techniques to identify the class of a personnel expense based on the description of the expense rather than the code provided by the municipality. For this, a corpus was built with 17,116 expense records labeled by domain experts, using binary and multi-class approaches.
Data processing procedures were applied to extract attributes from the textual description and to assign numerical values to each instance of the data set with the TF-IDF algorithm. In the modeling stage, the Multinomial Naïve Bayes, Logistic Regression, and Support Vector Machine (SVM) algorithms were used for supervised classification. SVM proved to be the best algorithm, with F-scores of 0.92 and 0.97 on the multi-class and binary corpora, respectively. However, the labeling process carried out by human experts proved to be complex, time-consuming, and expensive. Therefore, this work developed a method to classify personnel expenses using only 235 labeled samples, augmented with unlabeled instances, based on an adaptation of the Self-Training algorithm, producing very promising results, with an average F-score between 0.86 and 0.89.
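
As a rough illustration of the TF-IDF weighting mentioned in the abstract above, the following sketch turns short expense descriptions into weighted term vectors. It is not the implementation used in the thesis: the whitespace tokenizer, the idf variant ln(N / df), the function name `tfidf`, and the sample descriptions are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: relative term frequency times idf = ln(N / df).
    The abstract does not state which TF-IDF variant was used.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical expense descriptions, invented for illustration only.
descriptions = [
    "monthly salary payment teachers",
    "salary payment health staff",
    "purchase of office supplies",
]
weights = tfidf(descriptions)
```

Under this variant, a term that occurs in every description receives weight zero, while terms confined to a single description receive the highest weight. Vectors of this kind could then be passed to a classifier such as the SVM named in the abstract.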