Biopython: Change Sequence Id And Print Out New Sequence
10.1 years ago

I'm learning biopython and I'm having trouble with reassigning the seq_record.id variable for my sequence.

from Bio import SeqIO

# `file` and `name` are defined elsewhere in my script
for seq_record in SeqIO.parse(file, "fasta"):
    seq_record.id = name[0]  # name[0] is the new ID
    print(seq_record.id)
    SeqIO.write(seq_record, "/Users/bucephalus/Desktop/nc_pimps/raw/" + name[0] + ".fas", "fasta")

When the program prints the IDs, it prints what I want:

12345-123
12346-123
...

But when I open my files, the headers look like this:

>12345-123 gi|1906382|gb|K03455.1|HIVHXB2CG

HELP PLOX

Many thanks

biopython python fasta • 6.5k views
10.1 years ago
Neilfws 49k

Everything is as it should be.

In fasta format, the ID is the part immediately after the ">" and before the first whitespace. So when you ask for seq_record.id, that is what you see.

When you write out a new fasta file, the header contains the ID plus the record's description. The description is the text that follows the whitespace; it is not part of the ID, so changing seq_record.id alone leaves it in place.
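To make the ID/description split concrete, here is a minimal plain-Python sketch of how a header line divides at the first whitespace. The helper name `split_fasta_header` is hypothetical and this is not Biopython's own code; it only illustrates the convention described above.

```python
def split_fasta_header(line):
    """Split a FASTA header line into (id, description) at the first whitespace."""
    title = line[1:].strip()        # drop the leading ">" and surrounding whitespace
    parts = title.split(None, 1)    # split once, on any run of whitespace
    record_id = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return record_id, description


# The header from the question: a new ID followed by leftover old text.
record_id, description = split_fasta_header(">12345-123 gi|1906382|gb|K03455.1|HIVHXB2CG")
print(record_id)
print(description)
```

Here the first value is the part that seq_record.id reports, and the second is the leftover text that still ends up in the written file.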

If you change just the description, the fasta file contains the old id followed by the new description. If you set both the id and the description to the same new value, you get the effect you want.
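A minimal sketch of that fix, assuming a one-record input held in memory: the sequence string and the `new_id` value below are made up for illustration, with `new_id` standing in for `name[0]` from the question. It clears the description instead of duplicating the id into it; setting `seq_record.description = new_id`, as suggested above, should produce the same header.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical input: the header mirrors the old text seen in the question.
fasta_text = ">gi|1906382|gb|K03455.1|HIVHXB2CG\nTGGAAGGGCTAATTCACTCC\n"
new_id = "12345-123"  # stands in for name[0] from the question

out = StringIO()
for seq_record in SeqIO.parse(StringIO(fasta_text), "fasta"):
    seq_record.id = new_id
    seq_record.description = ""  # drop the old header text so only the id is written
    SeqIO.write(seq_record, out, "fasta")

result = out.getvalue()
print(result)
```

In the original script the same two assignments would go inside the loop, just before the call to SeqIO.write.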
