Question

Genbank file format is wrong?

0

9.9 years ago

eddie.im ▴ 140

Hello,

I'm having a hard time understanding the logic behind the GenBank file format, in particular the "complement" operator, which doesn't seem to make sense to me. Can someone help me out?

When I read through a full GenBank genome file, there are annotations like: complement(1..200).

So let's say my genome is 1000 bp. What I expect is that the sequence is pulled from positions 1 to 200 of the reverse complement of the sequence at the end of the file (which I assume is the "plus" strand), as in the example below:

     <--real--->
 1  ----------------------------------------------------- 1000

1000----------------------------------------------------- 1
    <possible**>                           <-expected--->

**The other possible logic is that the range points to the reverse complement (which would be 800..1000 of the reverse strand in my logic), but that's not the case either.

But it seems that, whether it's a complement or not, the sequence is always pulled directly from the plus strand. It's not even pulled and then reverse complemented; it's just pulled directly from the plus strand even though the annotation points to the reverse strand.

I must be missing something... can someone make this clear for me?

Thanks in advance

genbank • 3.4k views

Ram · Answer 1 · 2014-05-23

2

9.9 years ago

Zhaorong ★ 1.4k

I don't quite get your question. But reading this may help: http://www.insdc.org/files/feature_table.html#3.4.3
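
For reference, that section of the feature table definition describes complement(location) as reading the complement of the presented strand, in its 5'-to-3' direction, over the span given by location. So the coordinates always refer to the sequence as printed in the record, and the extracted span is then reverse complemented. Below is a minimal sketch of that rule in plain Python. The helper names and the short sequence are made up for illustration and are not part of any GenBank tool.

```python
# Hypothetical helpers; the sequence below is invented for illustration.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def extract(seq: str, start: int, end: int, complement: bool = False) -> str:
    """Extract start..end (1-based, inclusive, as in GenBank locations).

    The coordinates refer to the sequence as presented in the record.
    For complement(start..end), the same span is taken and then
    reverse complemented.
    """
    span = seq[start - 1:end]
    return reverse_complement(span) if complement else span


genome = "ATGCCGTTAA"
print(extract(genome, 1, 4))                   # ATGC
print(extract(genome, 1, 4, complement=True))  # GCAT
```

Under this reading, complement(1..200) on a 1000 bp genome covers positions 1..200 of the sequence in the file, read on the opposite strand from position 200 back to position 1. If a tool returns the plain plus-strand slice for such a feature, it may simply not be applying the complement step.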

0

Hi Zhaorong, thanks a lot! It helped me!

9.9 years ago by eddie.im ▴ 140