Question: What is wrong with NCBI's gff file?
8 days ago by
rimgubaev60 wrote:

This is part of NCBI's GFF file for the Nicotiana tabacum (tobacco) genome. I find it quite confusing that the gene, the mRNA, the first exon and the first CDS all start at the same position (2309). Any ideas?

NW_015787321.1  Gnomon  gene    2309    5631    .   +   .   ID=gene43;Dbxref=GeneID:107775605;Name=LOC107775605;gbkey=Gene;gene=LOC107775605;gene_biotype=protein_coding
NW_015787321.1  Gnomon  mRNA    2309    5631    .   +   .   ID=rna56;Parent=gene43;Dbxref=GeneID:107775605,Genbank:XM_016595671.1;Name=XM_016595671.1;gbkey=mRNA;gene=LOC107775605
NW_015787321.1  Gnomon  exon    2309    3923    .   +   .   ID=id448;Parent=rna56;Dbxref=GeneID:107775605,Genbank:XM_016595671.1;gbkey=mRNA;gene=LOC107775605
NW_015787321.1  Gnomon  exon    3967    5133    .   +   .   ID=id449;Parent=rna56;Dbxref=GeneID:107775605,Genbank:XM_016595671.1;gbkey=mRNA;gene=LOC107775605
NW_015787321.1  Gnomon  exon    5273    5631    .   +   .   ID=id450;Parent=rna56;Dbxref=GeneID:107775605,Genbank:XM_016595671.1;gbkey=mRNA;gene=LOC107775605
NW_015787321.1  Gnomon  CDS    2309    3923    .   +   0   ID=cds47;Parent=rna56;Dbxref=GeneID:107775605,Genbank:XP_016451157.1;Name=XP_016451157.1;gbkey=CDS;gene=LOC107775605
NW_015787321.1  Gnomon  CDS    3967    5133    .   +   2   ID=cds47;Parent=rna56;Dbxref=GeneID:107775605,Genbank:XP_016451157.1;Name=XP_016451157.1;gbkey=CDS;gene=LOC107775605
NW_015787321.1  Gnomon  CDS    5273    5631    .   +   2   ID=cds47;Parent=rna56;Dbxref=GeneID:107775605,Genbank:XP_016451157.1;Name=XP_016451157.1;gbkey=CDS;gene=LOC107775605

I'd recommend editing your post to add the common name of the organism and to explain why what you're seeing seems weird to you.

written 8 days ago by Ram
8 days ago by
Damian Kao15k wrote:

It is normal for the gene, the mRNA and the first exon to start at the same coordinate. The first CDS starting there as well suggests that the 5' UTR wasn't annotated for this gene. This is probably because the gene model was computationally predicted (the source column says Gnomon) rather than annotated with biological evidence (e.g. RNA-seq).
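
If you want to check this for other transcripts, here is a rough sketch (not an NCBI tool, just an illustration) that counts exonic bases lying outside the CDS span of one mRNA. It assumes tab-separated GFF3 lines; the helper names `parse_gff_line` and `utr_lengths` are made up for this example, and the attribute column is trimmed to `Parent=rna56` (with the phase column set to `.`) to keep it short.

```python
def parse_gff_line(line):
    """Split one tab-separated GFF3 feature line into the fields used below."""
    cols = line.rstrip("\n").split("\t")
    attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
    return {
        "type": cols[2],
        "start": int(cols[3]),
        "end": int(cols[4]),
        "strand": cols[6],
        "parent": attrs.get("Parent"),
    }


def utr_lengths(features, mrna_id):
    """Return (5' UTR, 3' UTR) lengths in exonic bases for one mRNA."""
    exons = [f for f in features if f["type"] == "exon" and f["parent"] == mrna_id]
    cds = [f for f in features if f["type"] == "CDS" and f["parent"] == mrna_id]
    if not exons or not cds:
        raise ValueError(f"no exon or CDS features found for {mrna_id}")

    cds_start = min(f["start"] for f in cds)
    cds_end = max(f["end"] for f in cds)

    # Exonic bases to the left / right of the CDS span (GFF coordinates are 1-based, inclusive).
    left = sum(max(0, min(f["end"], cds_start - 1) - f["start"] + 1) for f in exons)
    right = sum(max(0, f["end"] - max(f["start"], cds_end + 1) + 1) for f in exons)

    # On the minus strand the 5' end is on the right-hand side of the coordinates.
    return (left, right) if exons[0]["strand"] == "+" else (right, left)


# Exon and CDS coordinates copied from the question; other columns simplified.
ROWS = [
    ("exon", 2309, 3923), ("exon", 3967, 5133), ("exon", 5273, 5631),
    ("CDS", 2309, 3923), ("CDS", 3967, 5133), ("CDS", 5273, 5631),
]
GFF_LINES = [
    "\t".join(["NW_015787321.1", "Gnomon", ftype, str(start), str(end),
               ".", "+", ".", "Parent=rna56"])
    for ftype, start, end in ROWS
]

features = [parse_gff_line(line) for line in GFF_LINES]
print(utr_lengths(features, "rna56"))  # (0, 0)
```

For rna56 both values are 0, i.e. the exons and the CDS cover exactly the same bases, so neither a 5' nor a 3' UTR is annotated in this model.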


Now it's clear, thanks!

written 7 days ago by rimgubaev