Does not begin with a recognised start codon
6.9 years ago

I am running CodonW and getting warnings like these:

Warning: Sequence 280 "DNEG10010009 " does not begin with a recognised start codon.
Warning: Sequence 280 "DNEG10010009 " has 16 internal stop codon(s)

The input I provided is a set of nucleotide sequences, one for each gene of Bacillus subtilis 168.

How do I resolve these warnings? What I really want is to calculate values such as the codon adaptation index, the codon bias index, the number of optimal codons and so on for each gene of Bacillus subtilis 168, and I am using CodonW for that. I would also like to know whether the sequences I am providing as input are correct and, if not, what I should provide instead.

My other question is whether I should supply open reading frames (ORFs) for codon analysis. If a particular gene sequence has multiple ORFs, which one should be chosen?

Here are some of the sequences for which I am getting the warnings. I think I need to convert them to ORFs before giving them as input, but I am not sure. I only get these warnings for the non-essential genes of Bacillus subtilis 168.

>DNEG10010082
TTGATAGGGCAGAAAGCTTGGGTGAACATTGGCAAGACCGAATTCATCTTGCTTCTTGTC
GTTGGAATTTTAACCATCATCAATGTACTAACAGCAGACGGAGAAAAGCGTACATTTCAT
TCTCCTAAGAAAAAGAATATCAATCATTTAACCCTTTATGATTGCGTATCTCCGGAAGTT
CAGAACAGTATAAACGAAACAGGGCGTGTGACAAACTTCTTTTGA

>DNEG10010083
ATGAATCAAAATCAGTTGATATCGGTAGAGGATATCGTATTTCGATATCGGAAGGACGCA
GAAAGACGAGCACTAGACGGCGTCTCCCTGCAGGTGTATGAGGGTGAATGGCTTGCAATC
GTAGGTCATAACGGTTCAGGGAAATCAACACTGGCCCGGGCATTGAATGGTTTAATTCTT
CCTGAATCAGGCGACATTGAGGTTGCCGGGATTCAATTGACAGAGGAATCTGTTTGGGAA
GTGCGTAAGAAGATAGGTATGGTCTTTCAAAATCCGGATAACCAATTTGTCGGAACGACT
GTTCGCGATGATGTGGCTTTTGGTTTAGAAAACAATGGTGTACCGCGGGAAGAAATGATT
GAGAGAGTAGACTGGGCAGTAAAACAGGTGAATATGCAAGATTTTCTCGATCAAGAGCCG
CACCATCTCTCCGGAGGCCAAAAGCAGAGAGTTGCGATTGCGGGGGTTATTGCCGCACGT
CCTGATATTATTATCTTAGATGAAGCAACATCCATGCTTGATCCGATCGGGCGAGAAGAA
GTGCTTGAAACGGTAAGACATTTAAAAGAGCAGGGCATGGCGACTGTCATATCCATTACA
CATGACCTGAATGAGGCAGCAAAAGCAGACAGGATCATTGTCATGAATGGCGGTAAAAAA
TATGCTGAAGGGCCGCCTGAAGAGATTTTTAAATTGAATAAAGAACTTGTTCGAATTGGG
CTTGATTTACCCTTCTCATTCCAGCTTAGCCAGCTTTTAAGAGAAAATGGACTGGCTTTG
GAAGAAAACCATTTGACTCAGGAAGGGCTGGTGAAAGAGCTGTGGACATTACAATCAAAG
ATGTAG

Which codons does CodonW consider valid start codons? I would expect the program to handle start codons other than ATG, which would suggest that your sequences are malformed in some way.

Some things to check off the top of my head:

- Are you uploading/running with the right sequence format? Does it expect FASTA headers, for example? It may be tripping up on the ">" character.
- Check that your sequences are what you think they are (BLAST a couple, perhaps, to ensure you have the full sequence covered).
- As Istvan suggested, perhaps you're out of frame, or missing parts of the sequence.
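The format checks in this list can be automated before running CodonW. Below is a minimal sketch in plain Python, assuming the input is FASTA text held in a string; the helper names `read_fasta` and `format_problems` and the toy records are made up for illustration and are not part of CodonW.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping record ID to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token after '>' as the ID,
            # so "> geneB" and ">geneB " both give "geneB".
            fields = line[1:].split()
            name = fields[0] if fields else ""
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def format_problems(seq):
    """Return a list of format-level problems found in one sequence."""
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length %d is not a multiple of 3" % len(seq))
    bad = sorted(set(seq) - set("ACGT"))
    if bad:
        problems.append("unexpected characters: " + "".join(bad))
    return problems


# Toy records (hypothetical, not taken from the B. subtilis data set).
example = """>geneA
ATGAAATAA
> geneB
ATGAANTA
"""

for rec_id, seq in read_fasta(example).items():
    print(rec_id, format_problems(seq) or "ok")
```

A record whose length is not a multiple of three, or that contains characters other than A, C, G and T, is worth inspecting by hand before any codon usage statistics are computed on it.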

You should upload some examples of the sequences you're trying to use before we can help you properly though (can't say anything for certain without seeing what you're working with).

>DNEG10010082 
TTGATAGGGCAGAAAGCTTGGGTGAACATTGGCAAGACCGAATTCATCTTGCTTCTTGTC 
GTTGGAATTTTAACCATCATCAATGTACTAACAGCAGACGGAGAAAAGCGTACATTTCAT 
TCTCCTAAGAAAAAGAATATCAATCATTTAACCCTTTATGATTGCGTATCTCCGGAAGTT 
CAGAACAGTATAAACGAAACAGGGCGTGTGACAAACTTCTTTTGA

> DNEG10010083 
ATGAATCAAAATCAGTTGATATCGGTAGAGGATATCGTATTTCGATATCGGAAGGACGCA 
GAAAGACGAGCACTAGACGGCGTCTCCCTGCAGGTGTATGAGGGTGAATGGCTTGCAATC 
GTAGGTCATAACGGTTCAGGGAAATCAACACTGGCCCGGGCATTGAATGGTTTAATTCTT 
CCTGAATCAGGCGACATTGAGGTTGCCGGGATTCAATTGACAGAGGAATCTGTTTGGGAA 
GTGCGTAAGAAGATAGGTATGGTCTTTCAAAATCCGGATAACCAATTTGTCGGAACGACT 
GTTCGCGATGATGTGGCTTTTGGTTTAGAAAACAATGGTGTACCGCGGGAAGAAATGATT 
GAGAGAGTAGACTGGGCAGTAAAACAGGTGAATATGCAAGATTTTCTCGATCAAGAGCCG 
CACCATCTCTCCGGAGGCCAAAAGCAGAGAGTTGCGATTGCGGGGGTTATTGCCGCACGT 
CCTGATATTATTATCTTAGATGAAGCAACATCCATGCTTGATCCGATCGGGCGAGAAGAA 
GTGCTTGAAACGGTAAGACATTTAAAAGAGCAGGGCATGGCGACTGTCATATCCATTACA 
CATGACCTGAATGAGGCAGCAAAAGCAGACAGGATCATTGTCATGAATGGCGGTAAAAAA 
TATGCTGAAGGGCCGCCTGAAGAGATTTTTAAATTGAATAAAGAACTTGTTCGAATTGGG 
CTTGATTTACCCTTCTCATTCCAGCTTAGCCAGCTTTTAAGAGAAAATGGACTGGCTTTG 
GAAGAAAACCATTTGACTCAGGAAGGGCTGGTGAAAGAGCTGTGGACATTACAATCAAAG 
ATGTAG

You need to show us the sequence DNEG10010009, which is the one the program is complaining about in the warning you copied.

Those 2 sequences, at least, are fine as far as I can tell. DNEG10010083 starts with an ATG, which is the canonical start codon, and DNEG10010082 starts with TTG which is also a legitimate start codon. So unless CodonW isn't able to understand start codons other than ATG (which I would find pretty unlikely if the software is even remotely decent), then it must be a problem with (at least) the DNEG10010009 record.
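The same check can be run on every record to see which ones would trigger the two warnings. This is a rough sketch, not a reimplementation of CodonW: the start codon set below (ATG, GTG, TTG) is an assumption covering the common bacterial starts, and CodonW may recognise a different set. The function name `check_cds` and the toy sequences are hypothetical.

```python
# Assumption: the most common bacterial start codons; CodonW's own list may differ.
START_CODONS = {"ATG", "GTG", "TTG"}
# Stop codons of the standard and bacterial genetic codes.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into complete codons, ignoring trailing bases."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def check_cds(seq):
    """Report start codon, terminal stop and internal stop count for one CDS."""
    cs = codons(seq)
    return {
        "starts_ok": bool(cs) and cs[0] in START_CODONS,
        "ends_with_stop": bool(cs) and cs[-1] in STOP_CODONS,
        # Stops anywhere except the final codon are counted as internal.
        "internal_stops": sum(c in STOP_CODONS for c in cs[:-1]),
    }


print(check_cds("TTGAAATGA"))     # toy CDS starting with TTG
print(check_cds("CCCTAAAAATGA"))  # toy sequence with a bad start and an internal stop
```

Running this over the whole input file and printing only the records that fail narrows the problem down to the sequences that actually need attention, such as DNEG10010009.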

6.9 years ago

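This means that your nucleotide sequences are not quite right. They are possibly out of phase, i.e. not in the correct reading frame, which would also explain the large number of internal stop codons.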
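One way to test for a frame shift is to count internal stop codons in each of the three forward reading frames: a frame with no internal stops and a stop at the very end is the likely coding frame. The sketch below assumes the sequence is on the coding strand (the reverse complement is not examined); `internal_stops_per_frame` is a hypothetical helper and the example sequence is a toy, not one of the DNEG records.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def internal_stops_per_frame(seq):
    """Count stop codons, excluding the final codon, in forward frames 0, 1, 2."""
    seq = seq.upper()
    counts = []
    for offset in range(3):
        # Complete codons only, starting at this frame's offset.
        frame = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        counts.append(sum(c in STOP_CODONS for c in frame[:-1]))
    return counts


# Toy CDS "ATGATAGCTAACTAA" with one stray base prepended, so frame 1 is correct.
shifted = "C" + "ATGATAGCTAACTAA"
print(internal_stops_per_frame(shifted))
```

If a frame other than 0 comes out clean, trimming the leading bases up to that offset (and re-checking the start and stop codons) is a reasonable next step before passing the sequence to CodonW.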
