Question: Just the complement or the reverse sequence from Biopython, but not the reverse complement!
natasha.sernova (2.5k) wrote, 11 months ago:

Dear all,

I have a problem with Biopython.

I give it a FASTA sequence and need to write either the reversed sequence or just the complemented sequence to a separate output FASTA file.

The four lines below were taken from the Biopython cookbook, and the script works perfectly well.

from Bio import SeqIO

records = (rec.reverse_complement(id="rc_" + rec.id, description="reverse complement")
           for rec in SeqIO.parse("example.fasta", "fasta") if len(rec) < 700)

SeqIO.write(records, "rev_comp.fasta", "fasta")

18

But my goal is a FASTA file with just the complement of the initial sequence, or just its reverse.

So far I have not found any way to get them in Biopython.

I can open the final rev_comp.fasta, copy the sequence itself, do something like

my_seq[::-1]

and print the result to a new FASTA file; reversing the reverse complement gives me just the complement.

But that won't be a pure Biopython solution, and getting the reverse-only sequence this way is a bit more complicated.

Is there a simple way to do it in Biopython only? I don't actually need the reverse complement at all, just either the reverse or the complement of my initial FASTA sequence.

Thank you very much!

Natasha

biopython complement reverse • 540 views
modified 11 months ago by Vijay Lakhujani (1.3k) • written 11 months ago by natasha.sernova (2.5k)

Please use code formatting for readability (the 101010 button). I have modified your post for now.

written 11 months ago by WouterDeCoster (23k)

But the code looks weird to me, I don't think this can work.

EDIT: I was wrong :)

modified 11 months ago • written 11 months ago by WouterDeCoster (23k)

This code is not mine; see chapter 5.5.3 of http://biopython.org/DIST/docs/tutorial/Tutorial.html

Honestly, I didn't expect it would work, but it did.

written 11 months ago by natasha.sernova (2.5k)

Oh right, now I see it: it is a generator expression split over two lines. Never mind :-)

written 11 months ago by WouterDeCoster (23k)
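
For reference, the parenthesised construct in the question is a generator expression; a set comprehension would use curly braces instead. The sketch below illustrates the difference with a hypothetical list of lengths standing in for the records that SeqIO.parse yields. It also shows why the trailing backslash in the cookbook snippet is optional: a line break inside parentheses continues the expression on its own.

```python
import types

# Hypothetical stand-in data; the cookbook snippet iterates over SeqIO.parse(...) records.
lengths = [120, 950, 430]

# Parentheses give a generator expression: nothing is evaluated yet.
# No backslash is needed because the line break sits inside the parentheses.
short = (n for n in lengths
         if n < 700)
assert isinstance(short, types.GeneratorType)

# Curly braces would give a set comprehension instead, built immediately.
short_set = {n for n in lengths if n < 700}
assert isinstance(short_set, set)

# The generator yields its items lazily, in input order, once something iterates over it.
short_list = list(short)
print(short_list)
```

This laziness is why the cookbook version can stream a large FASTA file: SeqIO.write pulls one record at a time from the generator instead of holding them all in memory.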
Vijay Lakhujani (1.3k, India) wrote, 11 months ago:

Here you go:

USAGE: python complement.py -i input_fasta -o output_fasta

from Bio import SeqIO
import getopt, sys

input_file = output_file = None  # filled in from -i / -o below

def usage():
    print("Usage: complement.py -i <input_fasta> -o <output_fasta>")


try:
    options, remainder = getopt.getopt(sys.argv[1:], 'i:o:h')

except getopt.GetoptError as err:
    print(str(err))
    usage()
    sys.exit(2)

for opt, arg in options:
    if opt == '-i':
        input_file = arg
    elif opt == '-h':
        usage()
        sys.exit()  # exit only when help was requested
    elif opt == '-o':
        output_file = arg

if input_file is None or output_file is None:
    usage()
    sys.exit(2)

# Write each record's ID and the complement (not reversed) of its sequence.
with open(output_file, 'w') as out:
    for record in SeqIO.parse(input_file, "fasta"):
        out.write(">" + record.id + "\n" + str(record.seq.complement()) + "\n")

Let me know if it works & don't forget to upvote ;)

modified 11 months ago • written 11 months ago by Vijay Lakhujani (1.3k)

Very nice! Thank you! It doesn't like line 24; line 23 is out of place, isn't it? I commented it out, and everything worked.

But would the reverse-only case be just

out.write( ">"+record.id+"\n"+str(record.seq.reverse())+"\n" )

or something more complicated?

written 11 months ago by natasha.sernova (2.5k)
Strange, nothing is out of place. Luckily it worked for you: commenting out lines 23 and 24 only matters if you forget to provide the input/output file names.

Reverse would be:

out.write( ">"+record.id+"\n"+str(record.seq)[::-1]+"\n" )

Warning: untested, just try it.

written 11 months ago by Vijay Lakhujani (1.3k)

Perfect, it did work! Thank you very much!!!

written 11 months ago by natasha.sernova (2.5k)
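
For readers who want to stay within Biopython's own reading and writing, as the question asks, here is a minimal sketch that builds new SeqRecord objects and hands them to SeqIO.write, in the same style as the cookbook snippet. The helper names (complement_only, reverse_only, convert), the "c_"/"r_" ID prefixes and the in-memory example record are assumptions made for illustration; they are not part of Biopython or of the answer above.

```python
from io import StringIO

from Bio import SeqIO
from Bio.SeqRecord import SeqRecord


def complement_only(rec):
    """Complement each base, keeping the original left-to-right order."""
    return SeqRecord(rec.seq.complement(), id="c_" + rec.id, description="complement")


def reverse_only(rec):
    """Reverse the order of the bases; slicing a Seq returns a Seq."""
    return SeqRecord(rec.seq[::-1], id="r_" + rec.id, description="reverse")


def convert(in_handle, out_handle, transform):
    """Apply transform to every FASTA record and write the results as FASTA."""
    records = (transform(rec) for rec in SeqIO.parse(in_handle, "fasta"))
    return SeqIO.write(records, out_handle, "fasta")


# In-memory stand-in for example.fasta so the sketch runs without any files.
fasta_text = ">seq1 demo record\nAGTACACTGGT\n"

comp_out = StringIO()
convert(StringIO(fasta_text), comp_out, complement_only)

rev_out = StringIO()
convert(StringIO(fasta_text), rev_out, reverse_only)

print(comp_out.getvalue(), end="")
print(rev_out.getvalue(), end="")
```

SeqIO.parse and SeqIO.write accept file names as well as handles, so with real files the call would look like convert("example.fasta", "complement.fasta", complement_only).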
Vijay Lakhujani (1.3k, India) wrote, 11 months ago:

For the complement, there is already a method; this example is from the cookbook:

>>> from Bio.Seq import Seq
>>> from Bio.Alphabet import generic_dna
>>> my_dna = Seq("AGTACACTGGT", generic_dna)
>>> my_dna
Seq('AGTACACTGGT', DNAAlphabet())
>>> my_dna.complement()
Seq('TCATGTGACCA', DNAAlphabet())

To reverse a sequence, use the standard Python slicing trick for string reversal:

seq[::-1]

where seq is your sequence string. If you have a Seq object, str(seq)[::-1] should work. See the example below:

>>> from Bio.Seq import Seq
>>> from Bio.Alphabet import generic_dna
>>> my_dna = Seq("AGTACACTGGT", generic_dna)
>>> type(my_dna)
<class 'Bio.Seq.Seq'>
>>> str(my_dna)
'AGTACACTGGT'
>>> str(my_dna)[::-1]
'TGGTCACATGA'
modified 11 months ago • written 11 months ago by Vijay Lakhujani (1.3k)
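
A note for current Biopython releases: Bio.Alphabet was removed in Biopython 1.78, so the generic_dna imports in the sessions above fail there. The sketch below repeats the same operations without an alphabet argument, which to my knowledge also runs on older releases. It additionally illustrates two points from this thread: slicing a Seq with [::-1] returns a Seq (so no str() round trip is needed), and reversing the reverse complement leaves the plain complement.

```python
from Bio.Seq import Seq

# No alphabet argument is passed: Bio.Alphabet is gone in Biopython >= 1.78.
my_dna = Seq("AGTACACTGGT")

comp = my_dna.complement()        # complement only, same orientation
rev = my_dna[::-1]                # reverse only; slicing a Seq returns a Seq
rc = my_dna.reverse_complement()  # both operations at once

print(type(rev).__name__)
print(comp, rev, rc)

# Reversing the reverse complement undoes the reversal, leaving the complement.
undone = rc[::-1]
print(str(undone) == str(comp))
```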

Thank you, Vijay! Initially I used Bio.Seq as well.

But I have a FASTA file with a header and a nucleotide sequence, and I need to process it with Biopython to get another FASTA file with the complement sequence; let's forget about the reverse one for now.

written 11 months ago by natasha.sernova (2.5k)