Question

Just complement or reverse sequence fom Biopython, but not reverse-complement one!

1

Entering edit mode

8.6 years ago

natasha.sernova ★ 4.0k

Dear all,

I have a problem with Biopython.

I give it a fasta-sequence and need to make either reversed fasta sequence out of it in a separate output file, or just complement fasta-sequence in a separate output file.

The four lines below were taken from Biopython cookbook, and the script works perfectly well.

from Bio import SeqIO

records = (rec.reverse_complement(id="rc_"+rec.id, description = "reverse complement") \

for rec in SeqIO.parse("example.fasta", "fasta") if len(rec)<700)

SeqIO.write(records, "rev_comp.fasta", "fasta")

18

But my goal is fasta-file with just complement to the initial sequence, or reverse sequence to the initial fasta sequence.

So far I have not found any way to get them in Biopython.

I can open final rev_comp.fasta, copy the sequence itself, do something like that:

my_seq[::-1] and print the result to a new fasta-file, it will give me just complement sequence.

But it won't be a pure Biopython application, and a little bit more complicated with reverse sequence only.

Is there any simple way to do it in Biopython only? Actually I don't need reverse complement sequence at all, just either reverse or complement fasta sequences for my initial fasta-sequence.

Thank you very much!

Natasha

Biopython reverse complement • 5.9k views

ADD COMMENT • link updated 8.6 years ago by lakhujanivijay 5.9k • written 8.6 years ago by natasha.sernova ★ 4.0k

1

Entering edit mode

Please use code formatting for readability (101010 button), I modified your post for now.

ADD REPLY • link 8.6 years ago by WouterDeCoster 48k

1

Entering edit mode

But the code looks weird to me, I don't think this can work.

EDIT: I was wrong :)

ADD REPLY • link 8.6 years ago by WouterDeCoster 48k

0

Entering edit mode

This code is not mine,

see http://biopython.org/DIST/docs/tutorial/Tutorial.html

chapter 5.5.3

Honestly I didn't expect it would work, but it did.

ADD REPLY • link 8.6 years ago by natasha.sernova ★ 4.0k

1

Entering edit mode

Oh right, now I see it, it is a set comprehension split over two lines. Nevermind :-)

ADD REPLY • link 8.6 years ago by WouterDeCoster 48k

0

Entering edit mode

8.6 years ago

lakhujanivijay 5.9k

For complement, there is already a function- example from the cookbook

>>> from Bio.Seq import Seq
>>> from Bio.Alphabet import generic_dna
>>> my_dna = Seq("AGTACACTGGT", generic_dna)
>>> my_dna
Seq('AGTACACTGGT', DNAAlphabet())
>>> my_dna.complement()
Seq('TCATGTGACCA', DNAAlphabet())

To reverse a string, just use this trick for string reversal

seq[::-1]

where seq is your sequence string. If you have seq object, str(seq)[::-1] should work. See below example

>>> from Bio.Seq import Seq
>>> from Bio.Alphabet import generic_dna
>>> my_dna = Seq("AGTACACTGGT", generic_dna)
>>> type(my_dna)
<class 'Bio.Seq.Seq'>
>>> str(my_dna)
'AGTACACTGGT'
>>> str(my_dna)[::-1]
'TGGTCACATGA'

ADD COMMENT • link 8.6 years ago by lakhujanivijay 5.9k

1

Entering edit mode

Thank you, Vijay! Initially I used Bio.Seq as well.

But I have a fasta-file wth header and nucleotide sequence, then I should make something with it by Biopython and get

another fasta-file with a complement sequence, let's forget about a reverse one right now.

ADD REPLY • link 8.6 years ago by natasha.sernova ★ 4.0k

score 4 · Accepted Answer · 2016-11-24

4

Entering edit mode

8.6 years ago

lakhujanivijay 5.9k

Here you go:

USAGE: python complement.py -i input_fasta -o output_fasta

from Bio import SeqIO
import getopt, sys



def usage():
    print "Usage: complement.py -i <input_fasta> -o <output_fasta>"


try:
    options, remainder=getopt.getopt(sys.argv[1:], 'i:o:h')

except getopt.GetoptError as err:
    print str(err)
    usage()
    sys.exit()

for opt, arg in options:
    if opt in ('-i'):
        input_file=arg
    if opt in ('-h'):
        usage()
    sys.exit()
    elif opt in ('-o'):
        output_file=arg


out=open(output_file, 'w')

for record in SeqIO.parse(input_file, "fasta"):
    out.write( ">"+record.id+"\n"+str(record.seq.complement())+"\n" )
out.close()

Let me know if it works & don't forget to upvote ;)

ADD COMMENT • link 8.6 years ago by lakhujanivijay 5.9k

0

Entering edit mode

Very nice! Thank you! It does't like line 24 - line 23 is out of place, isn't it? I commented it, and everything worked.

But single reverse case will be just

out.write( ">"+record.id+"\n"+str(record.seq.reverse())+"\n" )

or something more complicated?

ADD REPLY • link 8.6 years ago by natasha.sernova ★ 4.0k

1

Entering edit mode

Strange, nothing is out of place, luckily it worked for you as commenting line 23 and 24 would affect only if you forget to provide the input/output file names.

Reverse would be :

out.write( ">"+record.id+"\n"+str(record.seq)[::-1]+"\n" )

Warning: untested - just try.

ADD REPLY • link 8.6 years ago by lakhujanivijay 5.9k

0

Entering edit mode

Perfect, it did work! Thank you very much!!!

ADD REPLY • link 8.6 years ago by natasha.sernova ★ 4.0k