Problem Adding Qualifiers To A Genbank File With Python
7.7 years ago

Hi guys:

I have this GenBank file:

LOCUS       sctg_0006_0001        172997 bp    DNA              UNK 01-JAN-1980
DEFINITION  sctg_0006_0001  length=172997
ACCESSION   sctg_0006_0001
VERSION     sctg_0006_0001
KEYWORDS    .
SOURCE      .
  ORGANISM  .

FEATURES             Location/Qualifiers
     CDS             <3..182
                     /note="ID=1_1;partial=10;start_type=Edge;rbs_motif=None;rbs_spacer=None;gc_cont=0.722;conf=99.97;score=35.07;cscore=31.85;sscore=3.22;rscore=0.00;uscore=0.00;tscore=3.22;"
     CDS             372..1145
                     /note="ID=1_2;partial=00;start_type=ATG;rbs_motif=GGA/GAG/AGG;rbs_spacer=5-10bp;gc_cont=0.755;conf=100.00;score=149.21;cscore=143.89;sscore=5.32;rscore=-0.60;uscore=1.69;tscore=4.88;"
CDS[Many Many More]...


As you can see, it has the features with their respective locations and a note qualifier. What I'm trying to do is add a new qualifier called locus_tag to each CDS in this big file.

I have written this code, but I'm getting some problems:

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqFeature import SeqFeature, FeatureLocation
from Bio.SeqRecord import SeqRecord

annotation_handle = open("/Users/jcastrof/Desktop/prueba/prueba_str.gbk","rU")

for record in SeqIO.parse(annotation_handle,"genbank"):

    a = len(record.features)

    for_rast = open("/Users/k/Desktop/prueba/contig_for_rast.gbk","w")

    for x in range(0, a):

        locus_tag = {"locus_tag":"%s_%s" % (record.id,x+1)}

        new_record = (SeqFeature(qualifiers = locus_tag))

        record.features.append(new_record)

    SeqIO.write(record, for_rast, "genbank")
    for_rast.close()


And I've got this error:

Traceback (most recent call last):
  File "/Users/k/Desktop/add_tag_locus.py", line 32, in <module>
    SeqIO.write(record, for_rast, "genbank")
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Bio/SeqIO/__init__.py", line 426, in write
    count = writer_class(fp).write_file(sequences)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Bio/SeqIO/Interfaces.py", line 254, in write_file
    count = self.write_records(records)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Bio/SeqIO/Interfaces.py", line 239, in write_records
    self.write_record(record)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Bio/SeqIO/InsdcIO.py", line 775, in write_record
    self._write_feature(feature, rec_length)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Bio/SeqIO/InsdcIO.py", line 305, in _write_feature
    assert feature.type, feature
AssertionError: type:
location: None
qualifiers:
    Key: locus_tag, Value: sctg_0006_0001_1


What would you suggest? (please try to help me out :D ). Thanks!


As I recall, with Artemis (http://www.sanger.ac.uk/resources/software/artemis/) you can load your GenBank file and add qualifiers (such as locus_tag) to all features or to a filtered subset (for example, to all CDS features). Maybe this could save you some work.


Thanks, I'll try it. But what I need is to do this for many files.


I think it will be fine for "many" as in 5 to 10 files, but if it's closer to 200 you will probably need to fall back on another solution.

7.7 years ago

Your code creates a new feature that contains only your locus_tag qualifier. The error occurs because this new feature has no type (such as CDS), so Biopython can't write it out in GenBank format.
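To see the problem in isolation, here is a minimal sketch (assuming a reasonably recent Biopython; the coordinates and the second tag value are made up for illustration) that compares a feature built from qualifiers alone with one that also has a type and a location:

```python
from Bio.SeqFeature import SeqFeature, FeatureLocation

# A feature built from qualifiers alone, as in the question:
# its type is an empty string and its location is None.
bare = SeqFeature(qualifiers={"locus_tag": ["sctg_0006_0001_1"]})
print(repr(bare.type), bare.location)

# A feature the GenBank writer can handle needs at least a type and a location.
# These coordinates are hypothetical, chosen only for illustration.
complete = SeqFeature(
    FeatureLocation(371, 1145),
    type="CDS",
    qualifiers={"locus_tag": ["sctg_0006_0001_2"]},
)
print(complete.type, complete.location)
```

The empty type on the first feature is what trips the `assert feature.type, feature` line shown in the traceback.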

From your description, it sounds like you want to add a qualifier to the existing CDS features rather than create new features, so you want something like:

# Inside the "for record in SeqIO.parse(...)" loop:
x = 0
final_features = []
for f in record.features:
    if f.type == "CDS":
        # Number the CDS features starting from 1
        f.qualifiers["locus_tag"] = "%s_%s" % (record.id, x+1)
        x += 1
    final_features.append(f)

record.features = final_features
with open("/Users/k/Desktop/prueba/contig_for_rast.gbk","w") as for_rast:
    SeqIO.write(record, for_rast, "genbank")


Hope this helps


Tip: You can simplify the last two lines by just calling the write function with a filename instead of a handle.

Brad: Do you need the final_features list bit? Can't you remove that as you are editing the features in situ?
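Putting both suggestions together, a sketch might look like the following. The function names `add_locus_tags` and `tag_file` and the demo record are hypothetical, introduced only for illustration; the qualifier is stored as a one-element list to match what the GenBank parser produces for qualifiers it reads.

```python
from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqFeature import SeqFeature, FeatureLocation
from Bio.SeqRecord import SeqRecord


def add_locus_tags(record):
    """Add a locus_tag qualifier to every CDS feature of record, in place."""
    n = 0
    for feature in record.features:
        if feature.type == "CDS":
            n += 1
            feature.qualifiers["locus_tag"] = ["%s_%s" % (record.id, n)]
    return n  # number of CDS features tagged


def tag_file(in_path, out_path):
    """Read a GenBank file, tag its CDS features, and write the result."""
    records = list(SeqIO.parse(in_path, "genbank"))
    for record in records:
        add_locus_tags(record)
    # SeqIO.write accepts a filename as well as an open handle.
    return SeqIO.write(records, out_path, "genbank")


# Small in-memory check with a made-up record (not the poster's data).
demo = SeqRecord(Seq("ATGAAATAA" * 10), id="demo_contig")
demo.features.append(SeqFeature(FeatureLocation(0, 9), type="CDS"))
demo.features.append(SeqFeature(FeatureLocation(9, 18), type="gene"))
demo.features.append(SeqFeature(FeatureLocation(18, 27), type="CDS"))
add_locus_tags(demo)
for f in demo.features:
    print(f.type, f.qualifiers.get("locus_tag"))
```

For many files, `tag_file` could be called in a loop over input and output paths, for example ones collected with `glob.glob`.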


Peter, you're right on both counts. With final_features, I was just trying to be explicit about the modification. I picked up that habit from working with immutable objects in Clojure. Your approach will work just as well and is shorter.


You can (I presume) still edit your answer if you want to. The Clojure influence makes sense now that you've explained it ;)


Yeah it worked!
