Genome size, de novo assembly, and annotation of Dipteryx alata (Leguminosae)

Date

2017-04-24

Publisher

Universidade Federal de Goiás

Abstract

In recent years, the availability and quality of sequencing data have increased rapidly, and with this came an explosion of plant genome sequencing projects. In this scenario, genomic analyses have proven efficient at generating genetic information on a large scale, including for non-model species. Dipteryx alata is a non-model tree species of the Leguminosae family, endemic to the Cerrado biome. The objectives of this work were to estimate the chromosome number and genome size of D. alata, and to assemble and annotate sequences of the organellar and nuclear genomes of the species using Illumina sequencing data. The genome size of D. alata was estimated as 1C = 0.825 pg, which corresponds to a haploid genome of 807.2 Mb, with 2n = 16 chromosomes. A total of 275,709 nuclear genomic sequences were assembled, with an N50 of 1,598 bp, corresponding to 355 Mb, or 44% of the whole genome. In the nuclear sequences, 21,981 microsatellite regions were annotated, of which 49.3% had dinucleotide motifs, 42.7% trinucleotide motifs and 4% tetranucleotide motifs. Transposable elements (TEs) were found in 39.29% of the sequences analyzed, corresponding to 421,701 TEs. LTR retrotransposons (Gypsy and Copia) were the most abundant TEs in the nuclear sequences. A total of 1,431 non-coding RNA genes were annotated: 176 rRNAs, 189 tRNAs, 477 snRNAs, 8 snoRNAs, 466 miRNAs and 115 lncRNAs. In addition, 62,200 protein-coding genes were annotated, with an average size of 1,156 bp. The estimated number of mRNAs transcribed by the set of annotated nuclear genes was 160,450, of which 131,228 showed significant similarity to known sequences and 84,793 were functionally classified into Gene Ontology terms. A total of 736,787 SNPs and 90,803 InDels were discovered in the nuclear sequences. On average, 1 SNP was identified per 189 bp of the genome, and the ratio of transition (Ts) to transversion (Tv) mutations was 1.58. In total, 46.5% of the SNPs occur in a genic context, and SNP effects were annotated mainly in exons and intergenic regions. For the organelles, 110 kb of chloroplast sequences were assembled with an N50 of 2,384 bp, and 327 kb of mitochondrial sequences with an N50 of 1,784 bp. Genes for 3 rRNAs, 13 tRNAs, 6 miRNAs and 20 lncRNAs were annotated for the chloroplast, and genes for 4 rRNAs, 26 tRNAs, 7 miRNAs and 54 lncRNAs for the mitochondrion. For the chloroplast, 20 protein-coding genes were predicted, with a mean size of 2,374 bp; for the mitochondrion, 176 genes were predicted, with a mean size of 1,279 bp. The estimated number of mRNAs transcribed by these gene sets was 63 for the chloroplast and 525 for the mitochondrion. In the chloroplast sequences, 39 microsatellite regions and 4 TEs were annotated; in the mitochondrial sequences, 158 microsatellite regions and 26 TEs. This work, which can be considered one of the first genomic studies of a Cerrado species, represents a major advance in knowledge of the structure and organization of the D. alata genome. The results open the way for further genetic and genomic investigation of the species.
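As a rough check on two figures quoted in the abstract, the Python sketch below converts a 1C DNA content in picograms to megabase pairs and computes N50 from a list of sequence lengths. The factor of 978 Mbp per pg is the commonly used conversion and is an assumption here: the thesis may have used a slightly different factor or an unrounded 1C value, which would account for the small gap to the reported 807.2 Mb. The function names and the toy lengths are illustrative only and are not taken from the thesis.

```python
from typing import Iterable

# Assumed conversion factor: 1 pg of DNA is about 978 Mbp (commonly used value).
MBP_PER_PG = 978.0


def pg_to_mbp(c_value_pg: float) -> float:
    """Convert a 1C DNA content in picograms to megabase pairs."""
    return c_value_pg * MBP_PER_PG


def n50(lengths: Iterable[int]) -> int:
    """Return N50: the length L such that sequences of length >= L
    together cover at least half of the total assembled length."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return ordered[-1]  # not reached for non-empty input


if __name__ == "__main__":
    # 1C = 0.825 pg, as reported in the abstract
    print(round(pg_to_mbp(0.825), 2))
    # Toy sequence lengths (hypothetical, not from the assembly)
    print(n50([2, 3, 4, 5, 6]))
```

Under the assumed factor, 0.825 pg comes out at roughly 807 Mb, in line with the reported haploid genome size.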

Citation

TAQUARY, A. M. A. Tamanho, montagem de novo e anotação do genoma de Dipteryx alata (Leguminosae). 2017. 137 f. Tese (Doutorado em Genética e Biologia Molecular) - Universidade Federal de Goiás, Goiânia, 2017.