Entering edit mode
6.8 years ago
grant.hovhannisyan
★
2.6k
Hi Biostars, I think I have a very basic question, for which however could not find an answer in web.
I need to read fasta using SeqIO, change sequence for every record, and then write new records to file using again SeqIO.
What I do is:
from Bio import SeqIO
import re
sequences=[]
for record in SeqIO.parse("test.fasta", "fasta") :
record.seq = re.sub('[^GATC]', "", str(record.seq).upper())
sequences.append(record)
SeqIO.write(sequences,"output.fasta","fasta")
I get multiple errors running this code:
Traceback (most recent call last):
File "script.py", line 15, in <module>
SeqIO.write(sequences,"output.fasta","fasta")
File "/home/ghovhannisyan/Software/anaconda2/lib/python2.7/site-packages/Bio/SeqIO/__init__.py", line 481, in write
count = writer_class(fp).write_file(sequences)
File "/home/ghovhannisyan/Software/anaconda2/lib/python2.7/site-packages/Bio/SeqIO/Interfaces.py", line 209, in write_file
count = self.write_records(records)
File "/home/ghovhannisyan/Software/anaconda2/lib/python2.7/site-packages/Bio/SeqIO/Interfaces.py", line 194, in write_records
self.write_record(record)
File "/home/ghovhannisyan/Software/anaconda2/lib/python2.7/site-packages/Bio/SeqIO/FastaIO.py", line 202, in write_record
data = self._get_seq_string(record) # Catches sequence being None
File "/home/ghovhannisyan/Software/anaconda2/lib/python2.7/site-packages/Bio/SeqIO/Interfaces.py", line 100, in _get_seq_string
% record.id)
TypeError: SeqRecord (id=TCONS_00000001) has an invalid sequence.
I can bypass SeqIO.write method by parsing list and creating file, but I am wondering what is the problem here. Thanks
In order to provide closure for this thread it may be best to include @Peter's solution as an answer below (with the link attribution above) and then accept that answer.