Question: Multi-records genbank to CDS
3 months ago by hazirliver (10)
hazirliver wrote:

Hi! I have a file containing several GenBank records written one after the other. I need to extract the CDS features (protein sequence (/translation), /locus_tag, /inference, /product and contig ID) from all contigs. How can I do it?
The input format looks like this: [image]
And the result should look like this: [image]


How can I do this?

genbank biopython cds • 132 views

Since you are analyzing data, it would be helpful if you make some effort to write a small script to read a file line by line and process it.

written 3 months ago by husensofteng (290)
3 months ago by Joe (18k), United Kingdom
Joe wrote:

To clarify, you want all proteins/products, from all the entries in the file?

If so, take a look here: https://warwick.ac.uk/fac/sci/moac/people/students/peter_cock/python/genbank2fasta/
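For reference, a minimal sketch of that general approach (not the code from the linked page), assuming Biopython is installed and that each contig is its own GenBank record. The function names `cds_rows` and `write_cds_table` and the column names are made up for illustration. It walks every record with `SeqIO.parse(..., "genbank")`, keeps the CDS features, and writes the requested qualifiers plus the contig ID to a tab-separated file. Qualifiers that a CDS lacks (for example /translation on a pseudogene) are left empty.

```python
import csv

# Column names are illustrative, not a fixed standard.
FIELDS = ["contig", "locus_tag", "inference", "product", "translation"]


def cds_rows(records):
    """Yield one dict per CDS feature across all records (contigs)."""
    for record in records:
        for feature in record.features:
            if feature.type != "CDS":
                continue
            # Biopython stores qualifiers as name -> list of strings.
            q = feature.qualifiers
            yield {
                "contig": record.id,
                "locus_tag": "; ".join(q.get("locus_tag", [])),
                "inference": "; ".join(q.get("inference", [])),
                "product": "; ".join(q.get("product", [])),
                "translation": "".join(q.get("translation", [])),
            }


def write_cds_table(genbank_path, tsv_path):
    """Parse a multi-record GenBank file and write one TSV row per CDS."""
    # Imported here so cds_rows() stays usable without Biopython.
    from Bio import SeqIO

    with open(tsv_path, "w", newline="") as out:
        writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter="\t")
        writer.writeheader()
        writer.writerows(cds_rows(SeqIO.parse(genbank_path, "genbank")))
```

Usage would be something like `write_cds_table("contigs.gbk", "cds.tsv")`, where both file names are placeholders.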


Yes, thanks! The code in that article gave me an error, but the article put me on the right track. I found a working solution using SeqIO.InsdcIO.GenBankCdsFeatureIterator.
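A rough sketch of how that can look, not necessarily the exact code used here. It assumes Biopython is installed and relies on the "genbank-cds" format name of `SeqIO.parse`, which as far as I know is the usual way to reach that iterator. Each CDS comes back as its own record, with the /translation as the sequence and the other qualifiers stored as plain strings in `record.annotations`. The helper names `cds_record_to_row` and `iter_cds_rows` are made up for illustration. One caveat, to my understanding: this parser skips the record header, so the contig ID is not available this way and would need the regular "genbank" parser.

```python
def cds_record_to_row(record):
    """Turn one CDS record into a dict of the fields of interest."""
    ann = record.annotations  # qualifier name -> string
    return {
        "locus_tag": ann.get("locus_tag") or "",
        "inference": ann.get("inference") or "",
        "product": ann.get("product") or "",
        # A CDS without /translation has no sequence attached.
        "translation": "" if record.seq is None else str(record.seq),
    }


def iter_cds_rows(genbank_path):
    """Yield one row per CDS from a (multi-record) GenBank file."""
    # Imported here so cds_record_to_row() works without Biopython.
    from Bio import SeqIO

    for record in SeqIO.parse(genbank_path, "genbank-cds"):
        yield cds_record_to_row(record)
```

For example, `for row in iter_cds_rows("contigs.gbk"): print(row["locus_tag"], row["product"])`, where the file name is a placeholder.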

written 3 months ago by hazirliver (10)