Internal stop codons in protein sequences
7.4 years ago

Hi, some protein sequences contain internal stop codons (a * in the middle of the sequence).

For example,

http://www.candidagenome.org/cgi-bin/protein/proteinPage.pl?dbid=CAL0000175821&seq_source=C.%20albicans%20SC5314%20Assembly%2022

What do these many internal stop codons mean? When running bioinformatics predictions, should I remove these * characters? Or should I remove these sequences from the FASTA file before analysis?

Thanks in advance.

-Vivek Ananth

fasta sequence • 4.1k views
7.4 years ago

That means the annotation is incorrect. The frame is wrong, or the sequence is wrong, or there isn't really a gene there.
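
If it helps, here is a minimal sketch for spotting such records in a protein FASTA file before running predictions. It assumes a trailing * is just the normal terminal stop and that any other * is suspect. The function names and the two toy records are hypothetical, made up for illustration; they are not taken from CGD.

```python
def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def internal_stops(protein):
    """Return 0-based positions of '*' that are not the final character."""
    # Assumption: a single trailing '*' marks the normal end of the protein.
    body = protein[:-1] if protein.endswith("*") else protein
    return [i for i, aa in enumerate(body) if aa == "*"]


# Toy records, invented for illustration only.
example = """>ok_protein
MKTAYIAKQR*
>suspect_protein
MKT*AYI*AKQR*
"""

for name, seq in read_fasta(example):
    stops = internal_stops(seq)
    status = "flag for review" if stops else "ok"
    print(name, status, stops)
```

Flagging rather than silently deleting the * characters seems safer here: if the frame or the annotation is wrong, the residues around the stops are probably wrong too.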

7.4 years ago

I have no idea why the amino acid sequence is so full of errors, but if you download the DNA sequence and translate it yourself, it produces a single open reading frame. You may want to contact the webmaster at CGD about this problem.

FYI, I thought it might be due to the alternative genetic code used by Candida (both nuclear and mitochondrial genes contain variant codons), but that's not the explanation.
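
For anyone who wants to repeat this kind of check, below is a rough sketch of translating a coding sequence yourself in plain Python. It builds the standard codon table and, optionally, reads CTG (CUG) as serine, which is the reassignment in the alternative yeast nuclear code (NCBI translation table 12). The helper names and the 12-base toy sequence are hypothetical; this is not the exact procedure I used, and the real CGD sequence is not reproduced here.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AAS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)


def codon_table(ctg_as_serine=False):
    """Build a codon -> amino acid dict; optionally reassign CTG to serine."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    table = dict(zip(codons, STANDARD_AAS))
    if ctg_as_serine:
        table["CTG"] = "S"  # alternative yeast nuclear code (table 12)
    return table


def translate(dna, table, frame=0):
    """Translate DNA starting at the given 0-based frame offset.

    Codons not in the table (e.g. containing N) become 'X'.
    """
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        protein.append(table.get(dna[i:i + 3], "X"))
    return "".join(protein)


toy = "ATGCTGAAATAA"  # invented sequence, for illustration only

print(translate(toy, codon_table()))                    # MLK*
print(translate(toy, codon_table(ctg_as_serine=True)))  # MSK*
print(translate(toy, codon_table(), frame=1))           # C*N
```

Note that the CTG reassignment only swaps one amino acid for another and never creates a stop, while shifting the frame by one base is enough to put a * in the middle of the toy translation.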
