Genbank file format is wrong?
8.8 years ago
eddie.im ▴ 140

Hello,

I'm having a hard time understanding the logic behind the GenBank file format, in particular the "complement" operator, which doesn't seem to make sense to me. Can someone help me out?

When I read through a full GenBank genome file, there are annotations like: complement(1..200).

So let's say my genome is 1000 bp long. What I expect is that I'm pulling the sequence from range 1 to 200 of the reverse complement of the sequence at the end of the file (which I assume is the "plus" strand), as in the example below:

     <--real--->
1  ----------------------------------------------------- 1000

1000----------------------------------------------------- 1
<possible**>                           <-expected--->


**The other possible logic is that the range points to the reverse complement (which would be 800..1000 of the reverse strand in my logic), but that's not the case either.

But it seems that, whether the feature is marked complement or not, the sequence is always pulled directly from the plus strand. It's not even pulled and then reverse complemented; it's just taken directly from the plus strand even though the annotation points to the reverse strand.

I must be missing something... can someone make this clear for me?

file genbank reverse • 3.1k views
8.8 years ago
Zhaorong ★ 1.4k

I don't quite get your question. But reading this may help: http://www.insdc.org/files/feature_table.html#3.4.3
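
As far as I read that section, the coordinates inside complement(...) always refer to the sequence as printed in the file (the "plus" strand), and the operator means "take that span, then read its complement 5' to 3'", i.e. reverse complement it. So the positions do match what you see on the plus strand; the reverse-complement step is left to whoever extracts the feature. A minimal sketch in plain Python, where `reverse_complement`, `extract`, and the toy `genome` are hypothetical names made up for illustration and not part of any GenBank tool:

```python
# Translation table for the four standard bases (ambiguity codes are ignored here).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse so the result reads 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def extract(seq: str, start: int, end: int, complement: bool = False) -> str:
    """Extract a feature given 1-based, inclusive coordinates on the printed strand.

    With complement=True this mimics a location written as complement(start..end):
    the span is cut from the printed sequence first, then reverse complemented.
    """
    span = seq[start - 1:end]  # convert 1-based inclusive to a Python slice
    return reverse_complement(span) if complement else span

genome = "ATGCCGTA"  # toy sequence standing in for the ORIGIN block
print(extract(genome, 1, 4))                   # ATGC  (location 1..4)
print(extract(genome, 1, 4, complement=True))  # GCAT  (location complement(1..4))
```

If a tool hands you the plus-strand bases for a complement feature, it has probably done only the slicing step and skipped the reverse complement.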


Hi Zhaorong, thanks a lot! It helped me!