INF - Instituto de Informática
Browsing INF - Instituto de Informática by CNPq area "CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO"
Now showing 1 - 3 of 3
Item Preditor híbrido de estruturas terciárias de proteínas (Universidade Federal de Goiás, 2023-08-10) Almeida, Alexandre Barbosa de; Soares, Telma Woerle de Lima; http://lattes.cnpq.br/6296363436468330; Soares, Telma Woerle de Lima; Camilo Júnior, Celso Gonçalves; Vieira, Flávio Henrique Teles; Delbem, Alexandre Cláudio Botazzo; Faccioli, Rodrigo Antônio
Proteins are organic molecules composed of chains of amino acids and perform a variety of essential biological functions in the body. The native structure of a protein results from the folding of its amino acid chain, whose spatial orientation is primarily determined by two dihedral angles (φ, ψ). This work proposes hyPROT, a new hybrid method for predicting the tertiary structures of proteins that combines multi-objective evolutionary algorithm (MOEA) optimization, molecular dynamics, and recurrent neural networks (RNNs). The approach investigates the evolutionary profile of the dihedral angles (φ, ψ) produced by different MOEAs during dominance-based minimization of the objective function and energy minimization by molecular dynamics; this proposal is unprecedented in the protein structure prediction literature. The premise under investigation is that the evolutionary profile of the dihedrals may conceal relevant patterns about folding mechanisms. To analyze this profile, RNNs were used to abstract and generalize the specific biases of each MOEA. The selected MOEAs were NSGA-II, BRKGA, and GDE3, and the objective function combines the potential energy of non-covalent interactions with the solvation energy. The results show that hyPROT reduced the RMSD of the best prediction generated by each MOEA individually by at least 33%.
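RMSD, the metric by which hyPROT's improvement is reported above, measures the root-mean-square spatial deviation between predicted and native atomic coordinates. A minimal Python sketch (illustrative, not from the thesis) is below; note that in practice RMSD is reported after optimally superposing the two structures, e.g. with the Kabsch algorithm, which this sketch omits:

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of 3D points."""
    assert len(coords_a) == len(coords_b), "structures must have the same atom count"
    sq = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
             for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b))
    return math.sqrt(sq / len(coords_a))

# Toy two-atom example: the second atom is displaced by 1 Å along y.
native    = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (1.5, 1.0, 0.0)]
print(round(rmsd(native, predicted), 4))  # 0.7071
```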
Predicting new series of dihedral angles enabled the construction of histograms, suggesting a possible statistical ensemble governing the distribution of the dihedrals (φ, ψ) during the folding process.
Item Estudo comparativo de comitês de sub-redes neurais para o problema de aprender a ranquear (Universidade Federal de Goiás, 2023-09-01) Ribeiro, Diogo de Freitas; Sousa, Daniel Xavier de; http://lattes.cnpq.br/4603724338719739; Rosa, Thierson Couto; http://lattes.cnpq.br/4414718560764818; Rosa, Thierson Couto; Sousa, Daniel Xavier de; Canuto, Sérgio Daniel Carvalho; Martins, Wellington Santos
Learning to Rank (L2R) is a subarea of Information Retrieval that uses machine learning to optimize the positioning of the most relevant documents in the ranking returned for a query. Until recently, LambdaMART, an ensemble of regression trees, was considered the state of the art in L2R. However, the introduction of AllRank, a deep learning method that incorporates self-attention mechanisms, has overtaken LambdaMART as the most effective approach for L2R tasks. This study explored the effectiveness and efficiency of sub-network ensembles as a complement to the self-attention mechanism used in AllRank. Different methods for forming sub-network ensembles (Multi-Sample Dropout, Multi-Sample Dropout applied at both training and testing, BatchEnsemble, and Masksembles) were implemented and tested on two standard collections: MSLR-WEB10K and YAHOO!. The experiments indicated that some of these ensemble approaches, specifically Masksembles and BatchEnsemble, outperformed the original AllRank on metrics such as NDCG@1, NDCG@5, and NDCG@10, although they were more costly in terms of training and testing time.
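Multi-Sample Dropout, one of the ensemble-forming methods listed above, evaluates several independent dropout masks over the same shared features and averages the resulting scores, turning a single network into an implicit ensemble of sub-networks. A conceptual sketch in plain Python (a toy linear scoring head, not the AllRank implementation; the function names are illustrative):

```python
import random

def dropout(x, p, rng):
    """Zero each element with probability p; scale survivors by 1/(1-p) (inverted dropout)."""
    return [0.0 if rng.random() < p else v / (1.0 - p) for v in x]

def multi_sample_predict(features, weights, p=0.5, n_masks=4, seed=0):
    """Average a linear head's score over n_masks independent dropout masks
    applied to the same shared feature vector (the Multi-Sample Dropout idea)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_masks):
        masked = dropout(features, p, rng)
        scores.append(sum(w * v for w, v in zip(weights, masked)))
    return sum(scores) / n_masks

feats = [0.2, 0.8, 0.5, 0.1]
w = [1.0, -0.5, 0.3, 0.7]
print(multi_sample_predict(feats, w))
```

With dropout disabled (p=0) the average collapses to the plain dot product; the masks only add cost per forward pass of the shared features, which is why these ensembles are cheaper than training several full networks.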
In conclusion, the research shows that applying sub-network ensembles to L2R models is a promising strategy, especially in scenarios where latency is not critical. This work thus not only advances the state of the art in L2R but also opens new possibilities for gains in effectiveness and efficiency, motivating future research on sub-network ensembles in L2R.
Item Junções por similaridade aproximadas em espaços vetoriais densos (Universidade Federal de Goiás, 2023-08-24) Santana, Douglas Rolins de; Ribeiro, Leonardo Andrade; http://lattes.cnpq.br/4036932351063584; Ribeiro, Leonardo Andrade; Bedo, Marcos Vinicius Naves; Martins, Wellington Santos
Similarity join is an operation that returns all pairs of objects whose similarity is greater than or equal to a specified threshold, and it is essential for tasks such as data cleaning, mining, and integration. A common approach is to represent data as vectors, for example with the TF-IDF method, and measure the similarity between vectors with the cosine function. However, computing the similarity of all pairs of vectors can be computationally prohibitive on large datasets. Traditional algorithms exploit the sparsity of the vectors and apply filters to reduce the comparison space. Recently, advances in natural language processing have produced semantically richer vectors, improving result quality. These vectors, however, have different characteristics from those generated by traditional methods: they are dense and high-dimensional. Preliminary experiments showed that L2AP, the best-known algorithm for similarity joins, is not efficient in dense vector spaces. Due to the intrinsic characteristics of such vectors, approximate solutions based on specialized indexes are predominant for handling large datasets.
In this context, we investigate how to perform similarity joins using the Hierarchical Navigable Small World (HNSW) index, a state-of-the-art graph-based structure designed for approximate k-nearest neighbor (kNN) queries. We explored the design space of possible solutions, ranging from approaches layered on top of HNSW to deeper integration of similarity join processing into the index itself. The experiments showed speedups of up to 2.48 and 3.47 orders of magnitude over the exact method and the baseline approach, respectively, while maintaining recall close to 100%.
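For reference, the exact method that the approximate HNSW-based approaches are compared against amounts to a quadratic scan over all vector pairs. A minimal Python sketch of such a threshold-based cosine similarity join (illustrative only, not the thesis code):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal dimension."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_join(vectors, threshold):
    """Exact similarity join: all index pairs with cosine >= threshold.
    The O(n^2) pair enumeration is what approximate indexes like HNSW avoid."""
    pairs = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                pairs.append((i, j))
    return pairs

vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(similarity_join(vecs, 0.9))  # [(0, 1)]
```

An HNSW-based approximation would instead issue a kNN query per vector against the index and keep only the neighbors that clear the threshold, trading a small loss of recall for sub-quadratic work.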