Question: IGV tracks gene annotation
22 months ago by
bitpir140
bitpir140 wrote:

Hi there, I'm trying to visualize a reference genome in IGV and annotate the genes using a custom-made GFF file. A snapshot of the IGV view is attached below. I am particularly curious about the pink track. The GFF for the pink track looks something like this:

NC_023010.2 glimmer cds 5011 3686 3.08 - 1 orf00005;

NC_023010.2 glimmer cds 5052 5264 0.82 + 2 orf00006;

NC_023010.2 glimmer cds 5637 6800 3.11 + 2 orf00007;

As you can see, orf00005 is drawn differently from the others. Does anyone know why this is so? Is it a partial gene?

Screen Shot 2018 05 04 at 4 27 59 PM

Thanks for the help!

igv gff gene • 1.1k views
modified 22 months ago by h.mon29k • written 22 months ago by bitpir140
22 months ago by
h.mon29k
Brazil
h.mon29k wrote:

Probably IGV doesn't like the Glimmer gff, as it does not conform to the specification:

Columns 4 & 5: "start" and "end"

[...] Start is always less than or equal to end.

For orf00005, start > end.
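A quick way to find every such record before loading the file into IGV is to scan columns 4 and 5. The following is only a minimal sketch: it assumes the file is tab-separated with nine columns as the GFF specification describes, and the function name `find_inverted_features` is made up for illustration.

```python
def find_inverted_features(gff_lines):
    """Return (line_number, attributes, start, end) for records with start > end."""
    problems = []
    for number, line in enumerate(gff_lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and comment/directive lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a nine-column feature line
        start, end = int(fields[3]), int(fields[4])
        if start > end:
            problems.append((number, fields[8], start, end))
    return problems


# The three Glimmer records from the question, rewritten with explicit tabs
# (assumption: the original file is tab-separated).
sample = [
    "NC_023010.2\tglimmer\tcds\t5011\t3686\t3.08\t-\t1\torf00005;",
    "NC_023010.2\tglimmer\tcds\t5052\t5264\t0.82\t+\t2\torf00006;",
    "NC_023010.2\tglimmer\tcds\t5637\t6800\t3.11\t+\t2\torf00007;",
]

print(find_inverted_features(sample))
# [(1, 'orf00005;', 5011, 3686)]
```

For a real file, pass an open file handle instead of the `sample` list, since iterating over a handle yields lines in the same way.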


Hmm, I don't think that's the problem, because there are other ORFs that go in the reverse direction too (HISP_RS15385). I found this answer on another site (https://biology.stackexchange.com/questions/68431/clarification-on-refseq-genes-track-on-igv). The thinner line is supposed to be an untranslated region. Now I have to figure out why this feature is drawn that way while the features in the other tracks are treated as translated regions.

written 21 months ago by bitpir140

Whether an ORF "goes in the reverse direction" has nothing to do with the start and end coordinates; direction is given by the strand column:

Column 7: "strand"

The strand of the feature. + for positive strand (relative to the landmark), - for minus strand, and . for features that are not stranded. In addition, ? can be used for features whose strandedness is relevant, but unknown.

The feature you indicated (HISP_RS15385) is on the minus strand, like orf00005, hence both have left-facing arrows. However, if you look at the GFF, its start coordinate is less than its end coordinate (36612 < 37661):

NC_023010.2 RefSeq  gene    36612   37661   .   -   .   ID=gene-HISP_RS15385;Dbxref=GeneID:23802828;Name=HISP_RS15385;gbkey=Gene;gene_biotype=protein_coding;locus_tag=HISP_RS15385;old_locus_tag=HISP_16005
NC_023010.2 Protein Homology    CDS 36612   37661   .   -   0   ID=cds35;Parent=gene-HISP_RS15385;Dbxref=Genbank:WP_014030602.1,GeneID:23802828;Name=WP_014030602.1;gbkey=CDS;inference=COORDINATES: similar to AA sequence:RefSeq:WP_014030602.1;product=radical SAM protein;protein_id=WP_014030602.1;transl_table=11
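If the Glimmer records are meant to describe minus-strand ORFs with the two coordinates listed in reverse order, one possible repair is to swap columns 4 and 5 and leave the strand column alone. This is a sketch under that assumption, not a verified fix for Glimmer output: `normalize_coordinates` is a hypothetical helper, the input is assumed to be tab-separated, and the phase column may also need checking after the swap.

```python
def normalize_coordinates(line):
    """Swap start and end when start > end; the strand column is left untouched."""
    stripped = line.rstrip("\n")
    fields = stripped.split("\t")
    # Leave comments, directives and malformed lines exactly as they are.
    if stripped.startswith("#") or len(fields) < 9:
        return stripped
    start, end = int(fields[3]), int(fields[4])
    if start > end:
        fields[3], fields[4] = str(end), str(start)
    return "\t".join(fields)


# Assumption: tab-separated versions of records quoted in this thread.
glimmer = "NC_023010.2\tglimmer\tcds\t5011\t3686\t3.08\t-\t1\torf00005;"
refseq = "NC_023010.2\tRefSeq\tgene\t36612\t37661\t.\t-\t.\tID=gene-HISP_RS15385"

print(normalize_coordinates(glimmer).split("\t")[3:7])
# ['3686', '5011', '3.08', '-']
print(normalize_coordinates(refseq) == refseq)
# True
```

The RefSeq record passes through unchanged because it already satisfies start <= end, even though it is on the minus strand.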
modified 21 months ago • written 21 months ago by h.mon29k

I see! Got it, thank you so much for pointing that out. It's really weird that the Glimmer GFF uses that kind of format. I'll recheck the formatting of that file. Thanks!

written 21 months ago by bitpir140