genome annotation [sstart] [send] - how to get protein sequence from gene
12 months ago
danfarkas • 0

Hi,

I have a bacterial chromosome. I am struggling to understand how I can get protein sequences from genes annotated in the following way:

bacterial_chromosome_9_515

where sstart = 9 and send = 515

This is actually fine, as I can index the forward sequence using biopython:

faa = fasta[sstart:send].seq.translate(table=11)

However, when a gene is annotated in the reverse way, where sstart > send:

bacterial_chromosome_2423_1891

I am unsure how to get the corresponding protein sequence.

It would be much appreciated if someone could explain this.

Many thanks,

Dan

12 months ago

It means that the gene is on the negative (minus) strand, so you need to take the reverse complement of the sequence before translating it to amino acids. Something like this:

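# Assuming 1-based, inclusive coordinates (as in BLAST tabular output)
if sstart > send:
    # Minus strand: take the forward slice, then reverse-complement it
    faa = fasta[send - 1:sstart].seq.reverse_complement().translate(table=11)
else:
    faa = fasta[sstart - 1:send].seq.translate(table=11)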
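
To check the strand logic on something small, here is a dependency-free sketch. It assumes the coordinates are 1-based and inclusive and that sstart > send marks a minus-strand hit. The function name `extract_coding_strand` and the toy sequence are made up for illustration. With Biopython, `Seq.reverse_complement()` does the same job as the manual complement-and-reverse step.

```python
# Complement lookup for unambiguous DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_coding_strand(genome: str, sstart: int, send: int) -> str:
    """Return the nucleotides of a hit, read 5'->3' on its own strand.

    Assumption: coordinates are 1-based and inclusive, and
    sstart > send means the hit lies on the minus strand.
    """
    lo, hi = min(sstart, send), max(sstart, send)
    # Convert to a 0-based, end-exclusive Python slice.
    region = genome[lo - 1:hi]
    if sstart > send:
        # Minus strand: complement each base, then reverse the order.
        region = region.translate(COMPLEMENT)[::-1]
    return region


# Toy "chromosome" (hypothetical data, not from the thread).
genome = "AAATGGCCTAAGGG"
forward = extract_coding_strand(genome, 3, 11)
reverse = extract_coding_strand(genome, 11, 3)
print(forward)
print(reverse)
```

Under the same assumption, the 9..515 example covers 507 bases, i.e. 169 complete codons.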

Hi shenwei356,

Thanks for the clarification. That makes sense.

I also realised that the reason I was confused is that I also need to figure out which reading frame the amino acid sequence is translated in, as the genes may not start in the right frame. Can you suggest the best way to do this?

Daniel
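
One possible way to approach the frame question is to translate the extracted region at offsets 0, 1 and 2 and keep the offset with the fewest internal stop codons. This is only a heuristic sketch, not a method given in the thread. It assumes the region has already been oriented to its own strand, and the names `translate` and `best_frame` and the test fragment are hypothetical. The codon assignments are written out by hand so the example runs without Biopython. They are the ones shared by NCBI tables 1 and 11, which differ in start codons rather than in codon meaning. Note that, if the coordinates are inclusive, the 2423..1891 example spans 533 bases, which is not a multiple of three, so a partial hit is plausible there.

```python
from itertools import product

# Amino-acid assignments shared by NCBI translation tables 1 and 11,
# listed in the conventional TCAG base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate complete codons only; unrecognised codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODONS.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def best_frame(dna: str) -> tuple:
    """Return (offset, protein) for the offset with the fewest internal stops.

    Heuristic only: ties are broken in favour of the smallest offset.
    """
    candidates = []
    for offset in range(3):
        protein = translate(dna[offset:])
        # A stop at the very end is expected, so ignore trailing '*'.
        internal_stops = protein.rstrip("*").count("*")
        candidates.append((internal_stops, offset, protein))
    _, offset, protein = min(candidates)
    return offset, protein


# Hypothetical fragment: one stray base in front of a short ORF.
offset, protein = best_frame("GATGGCTAAACTGAAAGGCTAA")
print(offset, protein)
```

On short fragments several frames can be free of stops, so it is worth also checking that the chosen translation starts and ends where you expect.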
