Question

Adding Domains To CDS In GenBank Format

11.9 years ago

Lee Katz ★ 3.2k

Hi, I am wondering what the correct way is to add protein domains to CDS entries in a GenBank file.

InterProScan annotates a CDS and tells me where each protein domain might be (it may also annotate the whole gene, but that's not an issue for me). If I have a CDS spanning coordinates 330 to 1178, with one domain found at 342..1170 and a second at 348..1164, how should this be shown in the GenBank file? Better yet, is there a simple way to do it with BioPerl?

This is how I am currently doing it, but when I load the file into the Apollo genome viewer, which is my benchmark for correctness, it doesn't look right. The interface groups everything into a single misc_feature, with all features combined.

Thank you for your help!

LOCUS       NODE_80_length_3830_cov_32.131855         3952 bp    dna     linear   UNK
ACCESSION   unknown
FEATURES             Location/Qualifiers
     source          1..3952
                     /mol_type="genomic DNA"
                     /project="K5661"
                     /organism="XXXXXX"
     gene            330..1178
                     /locus_tag="K5661_draft_3226"
     CDS             330..1178
                     /locus_tag="K5661_draft_3226"
                     /product="Sulfate-binding protein sbp"
     misc_feature    342..1170
                     /locus_tag="K5661_draft_3226"
                     /evalue="1.2e-71"
                     /database_name="SUPERFAMILY"
                     /status="T"
                     /evidence=superfamily
                     /product="Sulfate-binding protein sbp"
                     /product="Periplasmic binding protein-like II"
                     /accession_num="SSF53850"
     misc_feature    348..1164
                     /locus_tag="K5661_draft_3226"
                     /evalue="1.9e-131"
                     /database_name="TIGRFAMs"
                     /status="T"
                     /evidence=HMMTigr
                     /product="Sulfate-binding protein sbp"
                     /product="3a0106s03: sulfate ABC transporter,
                     sulfate-bindin"
                     /accession_num="TIGR00971"

[etc, and ORIGIN with the sequence is correctly shown at the end]

genbank cds • 2.7k views

ADD COMMENT • link 11.8 years ago by Lee Katz ★ 3.2k

OK, no help on this exact question yet. Can anyone point me to documentation on sub-features in a GenBank file? I cannot work out from the basic GenBank documentation how to add them. Somehow GFF3 can do it but GenBank can't, which doesn't make sense to me.

ADD REPLY • link 11.8 years ago by Lee Katz ★ 3.2k

score 0 · Answer 1 · 2013-01-01

11.8 years ago

Lee Katz ★ 3.2k

I guess this is my best answer? See this related question:

How Can I Save Bioperl Sequence Nested Features In Genbank Or Embl Format?
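
For anyone landing here, a minimal sketch of the flat alternative to nested features, in plain Python rather than BioPerl. As far as I know the GenBank feature table has no parent/child mechanism, so this writes each domain as its own misc_feature, tied to the CDS only by a shared /locus_tag. Folding the InterProScan fields into a single /note is my assumption, made because qualifiers such as /evalue and /database_name are, to my knowledge, not part of the standard qualifier vocabulary. The helper names `format_feature` and `domain_feature` are hypothetical, and I have not checked how Apollo displays the result.

```python
import textwrap

KEY_INDENT = 5    # feature keys start in column 6
QUAL_INDENT = 21  # locations and qualifiers start in column 22
MAX_WIDTH = 80    # line width assumed for wrapping long qualifiers


def format_feature(key, start, end, qualifiers):
    """Return one feature-table entry as text.

    qualifiers is a list of (name, value) pairs; every value is quoted.
    """
    head = " " * KEY_INDENT + key.ljust(QUAL_INDENT - KEY_INDENT)
    lines = [f"{head}{start}..{end}"]
    for name, value in qualifiers:
        # Double any embedded quote so the quoted value stays well formed.
        text = '/{}="{}"'.format(name, str(value).replace('"', '""'))
        # Wrap on whitespace only, so continuation lines rejoin cleanly.
        for part in textwrap.wrap(text, width=MAX_WIDTH - QUAL_INDENT,
                                  break_on_hyphens=False):
            lines.append(" " * QUAL_INDENT + part)
    return "\n".join(lines)


def domain_feature(locus_tag, start, end, database, accession,
                   description, evalue):
    """Hypothetical helper: fold InterProScan fields into one /note."""
    note = f"{database} {accession}: {description}; e-value {evalue}"
    return format_feature(
        "misc_feature", start, end,
        [("locus_tag", locus_tag), ("note", note)],
    )


if __name__ == "__main__":
    # Values copied from the SUPERFAMILY hit in the question.
    print(domain_feature("K5661_draft_3226", 342, 1170, "SUPERFAMILY",
                         "SSF53850", "Periplasmic binding protein-like II",
                         "1.2e-71"))
```

Each domain then stands alone in the feature table, and a script or viewer can regroup the domains with their CDS by matching /locus_tag. Whether a given viewer draws them separately is something to test against your own file.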

ADD COMMENT • link 11.8 years ago by Lee Katz ★ 3.2k