Trouble understanding "Join" operator of NCBI
3.4 years ago
Prasad ★ 1.6k

Hello,

The Join operator is used for backsplicing events and split genes. Could anyone please provide additional details on how to interpret "Join" in the case of the pgRNA of NC_003977?

In NC_003977, pgRNA (pre-genomic RNA; encodes core and polymerase) is denoted as

join(1820..3182,1..1932)

Why does the pgRNA contain overlapping coordinates? Is it because it codes for 2 proteins and each set of coordinates corresponds to one of them?

When I export NC_003977 to gff3 format using the "Send to | gff3" option, the coordinates in the "Join" operator are summed up, causing them to exceed the genome size (refer image).

Is it a bug or intended behavior?

Is there documentation on how to interpret these types of coordinates?

NCBI gene gff3 • 985 views
3.4 years ago

It is a circular genome and goes around the "origin".

Basically, it is an oddity of representing a circular genome as a straight line. The feature runs past the end of the genome and wraps back around to the start.
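To make the wrap-around concrete, here is a minimal sketch that reads the location string from the question as two 1-based, inclusive segments and adds up their lengths. The helper names `parse_join` and `joined_length` are made up for illustration and are not part of any NCBI tool; the regular expression only handles plain `a..b` ranges.

```python
import re

def parse_join(location):
    """Parse a simple GenBank-style 'join(a..b,c..d)' string.

    Returns a list of 1-based, inclusive (start, end) tuples in the
    order they are written. Only plain 'a..b' ranges are handled.
    """
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]

def joined_length(segments):
    """Total number of bases covered when the segments are read in order."""
    return sum(end - start + 1 for start, end in segments)

segments = parse_join("join(1820..3182,1..1932)")
print(segments)                 # [(1820, 3182), (1, 1932)]
print(joined_length(segments))  # 3295
```

The first segment runs to position 3182, the last base of the linear representation, and the second picks up again at position 1. The total (3295) is longer than the genome because positions 1820..1932 are traversed twice.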


Yes, this is a circular genome and this gene goes past the reference position (1), but the 2 sets of coordinates overlap, so the region 1820..1932 is counted twice. Do we read this overlapping region only once when we make the gene sequence linear?

How should this be interpreted in gff3, given that the coordinates exceed the total genome size?


It does not matter that it overlaps: it starts at a start codon and goes until the next stop codon. If it runs into the gene again before reaching that stop codon, that is fine; it is in a different reading frame by then, so it produces different amino acids.

The reason it goes past the genome size is to show properly how long the feature is. A typical use case is to compute end - start to figure out the length of the feature. Since this feature goes around the origin, they want to capture that.
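As a rough sketch of that arithmetic: the GFF3 specification handles features that cross the origin of a circular landmark by adding the landmark length to the end coordinate. The function below applies that rule to the two segments from the question. The name `gff3_span` is hypothetical, and the genome size of 3182 is inferred from the first segment ending at 3182; the values are what the convention predicts, not a copy of the actual NCBI export.

```python
def gff3_span(first, second, genome_size):
    """Collapse two segments that cross the origin into one (start, end) pair.

    Assumes 1-based inclusive coordinates, that `first` runs to the last
    base of the genome, and that `second` starts at base 1. The end is
    shifted by the genome size so that start <= end still holds.
    """
    if first[1] != genome_size or second[0] != 1:
        raise ValueError("segments do not cross the origin")
    return first[0], second[1] + genome_size

start, end = gff3_span((1820, 3182), (1, 1932), genome_size=3182)
print(start, end)       # 1820 5114
print(end - start + 1)  # 3295, the full length of the feature
```

Note that with 1-based inclusive coordinates the length is end - start + 1, so a plain end - start is off by one.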

But you are correct in assuming that this may cause problems. Many tools that expect linear genomes will fail to operate properly on this particular GFF file.
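If a downstream tool only understands linear coordinates, one possible workaround is to split such an interval back into pieces that stay within the genome before handing it over. This is only a sketch: `split_wrapped` is a made-up helper, it assumes 1-based inclusive coordinates and a feature that crosses the origin at most once, and the genome size of 3182 is taken from the join in the question.

```python
def split_wrapped(start, end, genome_size):
    """Split an interval whose end exceeds the genome size into linear pieces.

    Returns a list of 1-based, inclusive (start, end) tuples, each of
    which fits inside 1..genome_size.
    """
    if end <= genome_size:
        return [(start, end)]  # nothing wraps, keep as is
    if end > 2 * genome_size:
        raise ValueError("feature wraps around the origin more than once")
    return [(start, genome_size), (1, end - genome_size)]

print(split_wrapped(1820, 5114, 3182))  # [(1820, 3182), (1, 1932)]
print(split_wrapped(100, 200, 3182))    # [(100, 200)]
```

The two pieces overlap at 1820..1932, which matches the original join; whether a given tool accepts overlapping parts of one feature has to be checked separately.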
