Multi-record GenBank to CDS
11 months ago
hazirliver ▴ 10

Hi! I have a file containing several GenBank records written one after the other. I need to extract the CDS features from all contigs: the protein sequence (/translation), /locus_tag, /inference, /product, and the contig ID. How can I do this?
The input format and the desired result were attached as images (not available here).

CDS genbank biopython

Since you are analyzing data, it would help if you made some effort to write a small script that reads the file line by line and processes it.

11 months ago
Joe 19k

To clarify, you want all proteins/products, from all the entries in the file?

If so, take a look here: https://warwick.ac.uk/fac/sci/moac/people/students/peter_cock/python/genbank2fasta/


Yes, thanks! The code in that article gave me an error, but it put me on the right track. I found a working solution using SeqIO.InsdcIO.GenBankCdsFeatureIterator.
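
For anyone landing here later: the poster's final code is not shown, so what follows is only a minimal sketch of one way to get the same fields, not their solution. It goes through the regular `SeqIO.parse(..., "genbank")` reader rather than the iterator named above, walks every record in the file, and keeps the CDS features. The function names `cds_rows` and `write_tsv`, the column names, and the TSV layout are made up for illustration, and using `record.id` as the contig ID is an assumption (depending on how the file was produced, `record.name` may be the better choice).

```python
import csv
import sys

# Qualifiers requested in the question (without the leading slash).
FIELDS = ["translation", "locus_tag", "inference", "product"]


def cds_rows(records):
    """Yield one dict per CDS feature across all records (contigs)."""
    for record in records:
        for feature in record.features:
            if feature.type != "CDS":
                continue
            # Assumption: record.id is what the question calls the contig ID.
            row = {"contig": record.id}
            for key in FIELDS:
                # Qualifier values are lists of strings, and a qualifier can
                # be missing entirely (e.g. no /translation on a pseudogene).
                row[key] = "; ".join(feature.qualifiers.get(key, []))
            yield row


def write_tsv(path, out=sys.stdout):
    """Write every CDS in a multi-record GenBank file as tab-separated rows."""
    # Imported here so cds_rows() stays usable without Biopython installed.
    from Bio import SeqIO

    writer = csv.DictWriter(out, fieldnames=["contig"] + FIELDS, delimiter="\t")
    writer.writeheader()
    # SeqIO.parse yields each record of a multi-record file in turn.
    for row in cds_rows(SeqIO.parse(path, "genbank")):
        writer.writerow(row)


if __name__ == "__main__" and len(sys.argv) > 1:
    write_tsv(sys.argv[1])
```

Saved as a script (any file name works), it takes the GenBank file as its only argument and prints the table to standard output. Missing qualifiers come out as empty cells instead of raising an error.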
