How to write sequences to FASTA format using SeqIO and SeqRecord
9.9 years ago
viv_bio ▴ 50

I have a set of sequences like this:

sequenceset = ['AGATAAGTTCACGTTACCAT', 'GTATT', 'TAGATGAAGCGGGATAGTCTTTTTCTGATATGCACTTATCAGTTCACTAGCAGT', 'ACTGAACGTGATTGATGAAGCT', 'ATCTA']

I converted them into SeqRecord objects:

for records in sequenceset:
my_seqs = SeqRecord(Seq(records,IUPAC.DNA), id = randomsequence)

Now, how can I write these to a file in FASTA format?

I tried:

handle = open(file_location,"w")
for sequences in my_seqs:
SeqIO.write(sequences,handle,"fasta")

But this raises an error:

AttributeError: 'str' object has no attribute 'id'

Python is indentation-sensitive, so please fix the indentation in your code fragments. Also, what is
"randomsequence"? Is it defined elsewhere? I suggest posting the complete code in one block.


You're missing bits of code here (e.g. the import lines). As Michael wrote, posting a complete example is a good idea.


I've reverted the code so that randomsequence has no quotes again, as this keeps the question consistent with the answer given by Dk.

9.9 years ago
David W 4.9k

The error message is telling you that the things you get from my_seqs are strings, not SeqRecords. Because my_seqs is reassigned on every pass of the for loop, it only ever holds the last sequence, and looping over that single SeqRecord yields its letters as strings.
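A minimal sketch of what goes wrong, assuming Biopython 1.78 or later (where Seq no longer takes an alphabet argument, so IUPAC.DNA is dropped here). The id is written as a quoted string purely for illustration, and the sequence list is shortened:

```python
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

sequenceset = ["AGATAAGTTCACGTTACCAT", "GTATT", "ATCTA"]

# Each pass rebinds my_seqs, so only the last record survives the loop.
for records in sequenceset:
    my_seqs = SeqRecord(Seq(records), id="randomsequence")

print(type(my_seqs).__name__)  # a single SeqRecord, not a list of them
print(str(my_seqs.seq))        # the last sequence in sequenceset

# Looping over one SeqRecord walks its letters, which are plain strings.
# Passing such a letter to SeqIO.write() fails because a str has no .id.
letters = list(my_seqs)
print(letters)
```

So the second loop in the question hands SeqIO.write() one character at a time, which is where the 'str' object error comes from.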

If I understand you correctly, you're trying to write all those sequences to a single FASTA file?

If that's the case, note that SeqIO.write() can take a list or a generator of SeqRecords, so you should pass one of those. Here's a generator expression:

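# generic_dna comes from Bio.Alphabet, which was removed in Biopython 1.78; drop that argument on newer versions
records = (SeqRecord(Seq(seq, generic_dna), id=str(index)) for index, seq in enumerate(sequenceset))
SeqIO.write(records, file_location, "fasta")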

Note that you don't have to use a file handle in recent versions of Biopython (a filename string is fine). If you don't like generator or list expressions (I think they're neat, but others disagree), you can build a list of SeqRecords with a for loop and write that instead:

records = []
for index, seq in enumerate(sequenceset):
    records.append(SeqRecord(Seq(seq, generic_dna), id=str(index)))
SeqIO.write(records, file_location, "fasta")
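For reference, here is a complete version with imports, as a sketch under a few assumptions: Biopython 1.78 or later (so no alphabet argument), a hypothetical output path in the system temp directory, and description="" so that each FASTA header carries only the ID. None of these choices come from the original question.

```python
import os
import tempfile

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

sequenceset = [
    "AGATAAGTTCACGTTACCAT",
    "GTATT",
    "TAGATGAAGCGGGATAGTCTTTTTCTGATATGCACTTATCAGTTCACTAGCAGT",
    "ACTGAACGTGATTGATGAAGCT",
    "ATCTA",
]

# Hypothetical output location; substitute your own path.
file_location = os.path.join(tempfile.gettempdir(), "sequences.fasta")

# One SeqRecord per sequence, using the list index as the ID.
records = [
    SeqRecord(Seq(seq), id=str(index), description="")
    for index, seq in enumerate(sequenceset)
]

# SeqIO.write() accepts a filename and returns the number of records written.
count = SeqIO.write(records, file_location, "fasta")
print(count)

# Read the file back to check that every sequence made the round trip.
read_back = [str(record.seq) for record in SeqIO.parse(file_location, "fasta")]
print(read_back == sequenceset)
```

A list is used here rather than a generator so the records can be inspected or reused after writing; a generator would work just as well for a single write.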
9.9 years ago

Did you put quotes around the id name?

So instead of:

my_seqs = SeqRecord(Seq(records,IUPAC.DNA), id = randomsequence)

Do this:

my_seqs = SeqRecord(Seq(records,IUPAC.DNA), id = "randomsequence")

Without quotes, randomsequence refers to a variable rather than a string.
