Hi, I am a bit confused by the new GFF3 file I am working on. The older one did not have UTRs, and I cannot follow some of the entries in the new file. Bear with me if it's silly! :)
The entries for one particular gene look like this (I deleted the other entries that are not needed for my question). I used these abbreviations for easier reading: 3'UTR = threeprimeUTR, 5'UTR = fiveprimeUTR.
ch00 gene 8835428 883785 ID=gene:c00g009140.2
ch00 mRNA 8834528 8837855 ID=mRNA:c00g009140.2.1;Parent=gene:c00g009140.2
ch00 exon 8835428 8835483 ID=exon:c00g009126.96.36.199;Parent=mRNA:c00g009140.2.1
ch00 CDS 8835428 8835483 ID=CDS:c00g009188.8.131.52;Parent=mRNA:c00g009140.2.1
ch00 intron 8835484 8835596 ID=intron:c00g009184.108.40.206;**Parent=mRNA:c00g009130.2.1**
ch00 intron 8835484 8835596 ID=intron:c00g009220.127.116.11;**Parent=mRNA:c00g009140.2.1**
ch00 exon 8835597 8835646 ID=exon:c00g00918.104.22.168;**Parent=mRNA:c00g009130.2.1**
ch00 3'UTR 8835597 8835646 ID=3'UTR:c00g00922.214.171.124;**Parent=mRNA:c00g009130.2.1**
ch00 exon 8835597 8835650 ID=exon:c00g009126.96.36.199;Parent=mRNA:c00g009140.2.1
and so on... My question: according to the GFF3 format specification, any feature named in a Parent attribute must itself be declared, i.e. there should be another mRNA with the ID c00g009130.2.1. If so, are these entries errors? (This GFF3 file is also a pre-release.) If not, could you please explain the logic behind them? Besides the introns, there are also UTRs and exons with the other parent. Is this an overlapping gene? I also don't really follow what an overlapping gene means. It would be great if someone could explain.
Hi Pablo, the fact that some people write GFF3-like files can hardly be held against the GFF3 specification, which is quite clearly defined (http://www.sequenceontology.org/gff3.shtml). While the GTF2.2 specification is clearly defined, it is also considerably more constrained with respect to what it can represent.
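In addition, when a file has no explicit UTR features, you can often infer the 3'UTR and 5'UTR from the difference between the CDS and exon features: the parts of a transcript's exons that are not covered by its CDS are untranslated.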
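To make the exon-minus-CDS idea concrete, here is a minimal Python sketch. It is only an illustration, not code from any GFF3 library: the function name `infer_utrs` and the coordinates in the usage lines are made up, and it assumes 1-based inclusive coordinates and that the CDS segments of a transcript lie inside its exons.

```python
def infer_utrs(exons, cds, strand):
    """Infer UTR intervals for one transcript (hypothetical helper).

    exons, cds: lists of (start, end) tuples, 1-based and inclusive.
    strand: "+" or "-".
    Returns (five_prime_utrs, three_prime_utrs).
    """
    if not cds:
        return [], []  # non-coding transcript: nothing to infer

    cds_start = min(start for start, _ in cds)  # leftmost coding base
    cds_end = max(end for _, end in cds)        # rightmost coding base

    left, right = [], []
    for start, end in sorted(exons):
        # Exonic bases to the left of the coding region.
        if start < cds_start:
            left.append((start, min(end, cds_start - 1)))
        # Exonic bases to the right of the coding region.
        if end > cds_end:
            right.append((max(start, cds_end + 1), end))

    # On the minus strand the transcript reads right to left,
    # so the left-hand pieces are the 3'UTR.
    if strand == "+":
        return left, right
    return right, left


# Made-up coordinates, not taken from the file discussed above.
exons = [(100, 200), (300, 400), (500, 600)]
cds = [(150, 200), (300, 400), (500, 550)]
five_prime, three_prime = infer_utrs(exons, cds, "+")
print("5'UTR:", five_prime)
print("3'UTR:", three_prime)
```

The sketch only looks at the outer bounds of the CDS, so it will not notice unusual cases such as a CDS segment that sticks out of its exon.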
Hi Pablo, thanks for your reply. I just checked again, and the other transcript is present in the file as well. My question was actually about the apparent absence of the other transcript that those Parent attributes refer to. Thanks again!
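For anyone who wants to automate this kind of check instead of searching the file by hand, a small Python sketch along these lines should work. It is only a sketch under assumptions: the function name `find_dangling_parents` and the sample records are hypothetical, and it relies on the file having the nine tab-separated columns with `key=value` attributes separated by `;` that the GFF3 specification describes.

```python
def find_dangling_parents(gff3_lines):
    """Report Parent references whose target ID is never declared.

    Returns a list of (line_number, child_id, parent_id) tuples.
    """
    declared = set()   # every ID seen anywhere in the file
    references = []    # every (line_number, child_id, parent_id)

    for lineno, line in enumerate(gff3_lines, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # not a feature line (e.g. embedded FASTA)

        # Column 9 holds the attributes, e.g. "ID=...;Parent=..."
        attrs = {}
        for field in columns[8].split(";"):
            if "=" in field:
                key, value = field.split("=", 1)
                attrs[key.strip()] = value.strip()

        if "ID" in attrs:
            declared.add(attrs["ID"])
        # A feature may list several parents separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                references.append((lineno, attrs.get("ID", ""), parent))

    # Compare only after reading everything, so a parent that is
    # declared later in the file is not reported as missing.
    return [ref for ref in references if ref[2] not in declared]


# Hypothetical records, not the real file from this thread.
sample = [
    "##gff-version 3",
    "\t".join(["ch00", ".", "gene", "100", "900", ".", "+", ".", "ID=gene:g1"]),
    "\t".join(["ch00", ".", "mRNA", "100", "900", ".", "+", ".",
               "ID=mRNA:g1.1;Parent=gene:g1"]),
    "\t".join(["ch00", ".", "intron", "200", "300", ".", "+", ".",
               "ID=intron:g2.1.1;Parent=mRNA:g2.1"]),
]
for lineno, child, parent in find_dangling_parents(sample):
    print(f"line {lineno}: {child} refers to undeclared parent {parent}")
```

If this prints nothing for the real file, every Parent points at a declared feature, which is what I found when I looked again.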