Parsing Genbank File: Get Locus Tag Vs Product
10.2 years ago
biotech ▴ 570

All I need is the locus tag and its product for each feature in a GenBank file. Is there an existing Perl or Python module that can do this?

Thanks

bioperl biopython • 5.2k views

Could you improve your question by giving an example input file (e.g. URL to an NCBI GenBank file) and the desired output (e.g. first few lines), since this is not clear. That would explain what you mean by product - which might be protein description, amino acid sequence, etc.

Hi Peter, thanks for your reply. Please see my question on Stack Overflow, where I also posted it. I'm using the Bio::GenBankParser module, as suggested by @TLP. It's giving me some issues, but it seems to fit my needs for now. http://stackoverflow.com/questions/22067785/parsing-genbank-file-get-locus-tag-vs-product

I don't think you got the most useful advice from SO. That module is an attempt to improve on something that works fine. Stick with the better supported, tried and tested original from BioPerl. Start with the Bio::SeqIO HOWTO and the Feature Annotation HOWTO.

Hi Neil, I'll dig a little more into BioPerl features, still very new for me. Thanks for your reply.

10.2 years ago
Peter 6.0k

Having read your question on StackOverflow (please don't double post like this), here's a minimal Biopython answer:

import sys
from Bio import SeqIO
filename = sys.argv[1]  # First command-line argument: the input GenBank filename
for record in SeqIO.parse(filename, "genbank"):
    for feature in record.features:
        if feature.type == "CDS":
            locus_tag = feature.qualifiers.get("locus_tag", ["???"])[0]
            product = feature.qualifiers.get("product", ["???"])[0]
            print("%s\t%s" % (locus_tag, product))

With minor changes you can write this out to a file instead.
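
For example, here is a rough sketch of one way to do that, not the only one. The helper names (`iter_locus_products`, `write_table`, `main`) and the two-argument command line are my own choices for illustration. The Biopython import sits inside `main` so the helpers can be tried on any objects with `.features`, `.type` and `.qualifiers`:

```python
import sys


def iter_locus_products(records, feature_type="CDS", missing="???"):
    """Yield (locus_tag, product) for each feature of the given type."""
    for record in records:
        for feature in record.features:
            if feature.type != feature_type:
                continue
            # Qualifier values are lists of strings; take the first entry,
            # falling back to a placeholder when the qualifier is absent.
            locus_tag = feature.qualifiers.get("locus_tag", [missing])[0]
            product = feature.qualifiers.get("product", [missing])[0]
            yield locus_tag, product


def write_table(records, handle):
    """Write tab-separated rows to an open text handle; return the row count."""
    count = 0
    for locus_tag, product in iter_locus_products(records):
        handle.write("%s\t%s\n" % (locus_tag, product))
        count += 1
    return count


def main(in_name, out_name):
    # Biopython is only needed here, when parsing a real GenBank file.
    from Bio import SeqIO

    with open(out_name, "w") as out_handle:
        n = write_table(SeqIO.parse(in_name, "genbank"), out_handle)
    sys.stderr.write("Wrote %d rows to %s\n" % (n, out_name))


# Only runs when called with exactly two arguments: input and output filenames.
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

If you save this as, say, `locus_products.py` (a made-up name), you would run it as `python locus_products.py input.gbk output.tsv`. If you don't want a script at all, redirecting the output of the original version with `> output.tsv` in the shell does the same job.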
