Rookie question. I have a tab-delimited .gff3 annotation file of human alternative splicing events, obtained here, that looks like this (showing the first record only):
chr1 A3SS gene 15796 16765 . - . ID=chr1:6470:6628:-@chr1:5805|5810:5659:-;Name=chr1:6470:6628:-@chr1:5805|5810:5659:-;gid=chr1:6470:6628:-@chr1:5805|5810:5659:-
I am trying to annotate the genes (get an ID, a name, etc.) to find out whether they are associated with certain disease states, based on the information contained in this annotation file.
I thought about extracting the chromosome localisation (columns 1, 4 and 5) and converting it to a VCF-style file that I could use with ANNOVAR or a similar program:
1 15796 16765 0 0
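To make that concrete, here is a minimal Python sketch of the conversion I have in mind. It assumes the real file is tab-delimited with the standard nine GFF3 columns, and that stripping the "chr" prefix and filling the last two fields with 0 is acceptable for the downstream tool (I have not verified that). The function name gff3_to_annovar is just my own label.

```python
def gff3_to_annovar(line):
    """Turn one GFF3 record into a 'chrom start end 0 0' line.

    Assumes nine tab-separated columns; the two trailing zeros are
    placeholders, not real reference/alternate alleles.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated GFF3 columns")
    # Columns 1, 4 and 5 hold the sequence name, start and end.
    chrom, start, end = fields[0], fields[3], fields[4]
    if chrom.startswith("chr"):
        chrom = chrom[3:]  # "chr1" -> "1"
    return " ".join([chrom, start, end, "0", "0"])


# The first record from my file, rebuilt with explicit tabs.
event = "chr1:6470:6628:-@chr1:5805|5810:5659:-"
record = "\t".join([
    "chr1", "A3SS", "gene", "15796", "16765", ".", "-", ".",
    "ID={0};Name={0};gid={0}".format(event),
])
print(gff3_to_annovar(record))  # 1 15796 16765 0 0
```

Header or comment lines starting with "#" would need to be skipped before calling this on a whole file.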
However, I am not sure of the meaning of the event ID (i.e. ID=chr1:6470:6628:-@chr1:5805|5810:5659:-).
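For reference, the ID can at least be taken apart mechanically. The sketch below is purely syntactic: it splits the attribute column on ";" and "=", then the ID on "@" and ":", and makes no assumption about what each piece means (that is exactly what I am asking). The function name split_event_id is hypothetical.

```python
def split_event_id(attributes):
    """Split the ID value of a GFF3 attribute column into its parts.

    Purely syntactic: no biological meaning is assigned to any field.
    """
    # The attribute column is a ';'-separated list of key=value pairs.
    pairs = dict(item.split("=", 1) for item in attributes.split(";") if item)
    event_id = pairs["ID"]
    # '@' separates the two halves; ':' separates fields within a half.
    return [part.split(":") for part in event_id.split("@")]


event = "chr1:6470:6628:-@chr1:5805|5810:5659:-"
attributes = "ID={0};Name={0};gid={0}".format(event)
for part in split_event_id(attributes):
    print(part)
# ['chr1', '6470', '6628', '-']
# ['chr1', '5805|5810', '5659', '-']
```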
(a) Could someone explain the format of the ID?
(b) Are the yellow parts the chromosome localisation and could it be used to retrieve the gene name and other info?
(c) Is there a more straightforward way of annotating the gene based on this localisation?