yn00 in PAML returns error: Error in sequence data file: E at 3 seq 1.
4.8 years ago

Hi everyone,

I am running yn00 on around 35,000 alignments. My codon alignments were generated by pal2nal in PAML input format. When I run yn00, dN and dS are calculated for only around 300 alignments and not for the remaining ~34,000. I tried running the failing alignments one by one and I get the same error for all of them:

Error in sequence data file: E at 3 seq 1. Make sure to separate the sequence from its name by 2 or more spaces

I have checked my alignments and they seem to be fine, in that their lengths are multiples of three. It's very confusing, as ~200 of them do not have this problem although they were all generated by pal2nal. I am attaching a few alignments that did not run. Please help me figure this out. Thank you.

2 642

ENSMUSG00000004821_ENSMUST00000004943_Tmed11_5_108777235_108795363 ATGCAAATTCAGACAATTCTTTTATGTTTTAGCTTTTCATTTTCAGCTGCTTTTTATTTC CATGCTGGAGAGCGAGAGGAGAAATGTATAATTGAAGACATTCCAAGTGATACATTGATA ACAGGGACATTCAAGGTACAGCAGTGGGACATAGTCAGACATGACTTCCTTGAATCTGCT CCTGGCTTAGGAATGTTTGTGACTGTTACAACTAATGATGAGGTATTATTATCCAAGTTA TATGGTGCACAAGGAACATTCTATTTTACTTCTCATTCATCTGGTGAACACATCATTTGC TTAGAATCTAATTCTACACAGTTTGTGTCATTTGGAGGAAGTAAGCTGCGCATCCACTTA GATATTCGAGTTGGAGAACATGACCTTGATGCAGCTATTGTTCAAGCAAAGGATAAAGTT AATGAAGTAACCTTCAAGCTTCAACATCTAATTGAACAAGTGGAGCAAATACTCAAAGAA CAAGACTATCAAAGGGACCGTGAAGAAAATTTCCGTATAACCAGTGAAGATACCAATAGA AATGTTTTATGGTGGGCTTTTGCACAAATATTGATCTTTATCTCAGTTGGAATTTTTCAA ATGAAACACCTTAAAGATTTCTTCATAGCTAAGAAGCTTGTT ENSRNOG00000000035_Tmed11_ENSRNOT00000000040_14_1932659_1953305 ATGCAAACTCAGACAATTCTCTTATGTTTCAGTTTTTCCTTTTCAGCTGCTTTTTATTTC CATGCTGGGGAGCGAGAGGAGAAATGTATAATCGAAGACATTCCAAGTGACACGTTGATA ACAGGGACATTCAAGATACAGCAGTGGGACATTGGTAGACATGACTTTCTTGAATCTGCT CCTGGCTTAGGAATGTTTGTGACTGTTACAAACAATGATGAGGTATTATTATCCAAGTTA TATGGTGCACAAGGGACATTCTATTTTACTTCACACTCATCTGGTGAACACATCATTTGC TTAGAATCTAATTCTACACAATTTGTGTCATTTGGAGGGAGTAAGCTGCGCATCCACTTA GATATTCGAGTTGGAGAGCATGACCTTGATGCAGTTATTGTTCAAGCAAAGGACAAAGTT AATGAAGTAGCCTTCACGCTTCGACATCTAATTGAACAAATTGAACAAATACTCAAAGAA CAAGACTATCAAAGGGACCGTGAGGAAAATTTCCGTATCACCAGTGAAGATACCAATAGA AATGTTTTATGGTGGGCTTTCGCACAAATATTAATCTTTATCTCAGTTGGAATTTTTCAA ATGAAGCACCTTAAAGATTTCTTCATAGCTAAGAAGCTTGTT

Sorry for this format. I couldn't figure out how to attach a file here.

PAML pal2nal phylip input file • 3.2k views
4.0 years ago
lxw34 • 0

I believe that your sequence name exceeds the maximum of 30 characters.
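With tens of thousands of files, a short script can flag over-long names quickly. Below is a minimal Python sketch, not a definitive check. It assumes the first non-empty line is the PHYLIP header, that a name is the first whitespace-delimited token on its line, and that the 30-character limit mentioned above applies to your PAML version. The function name `long_names` is made up for illustration.

```python
def long_names(phylip_text, max_len=30):
    """Return sequence names longer than max_len characters.

    Assumptions (not taken from PAML itself): the first non-empty line
    is the header (e.g. "2 642"), and a name is the first
    whitespace-delimited token on a line. Lines whose first token
    consists only of nucleotide/gap characters are treated as sequence
    data and skipped.
    """
    sequence_chars = set("ACGTUNacgtun-?.")
    lines = [ln for ln in phylip_text.splitlines() if ln.strip()]
    flagged = []
    for line in lines[1:]:  # skip the header line
        token = line.split()[0]
        if set(token) <= sequence_chars:
            continue  # looks like sequence data, not a name
        if len(token) > max_len:
            flagged.append(token)
    return flagged


example = (
    "2 12\n"
    "ENSMUSG00000004821_ENSMUST00000004943_Tmed11_5_108777235_108795363  ATGCAAATTCAG\n"
    "short_name  ATGCAAACTCAG\n"
)
print(long_names(example))
```

On the name from the question this flags the first record (66 characters) and leaves `short_name` alone. If names turn out to be the problem, shortening them before running pal2nal is one option.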

3.5 years ago
al-ash ▴ 190

Please also show the headers of the alignments that "are fine", so we can see whether the problem is header length.

I had this error when my input PHYLIP file had a tab, instead of two spaces, between the sequence name and the sequence. Replacing the tab with two spaces (e.g. in bash via sed 's/\t/  /') fixed the problem, so that might be another thing to check.
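If you prefer to do the same replacement from a script (for example while looping over many alignment files), here is a minimal Python sketch of the idea. It assumes, like the sed command, that only the first tab on each line should be replaced. The function name `tabs_to_spaces` is hypothetical.

```python
def tabs_to_spaces(phylip_text):
    """Replace the first tab on each line with two spaces.

    Mirrors a per-line, first-match substitution; all other characters,
    including a trailing newline, are left untouched.
    """
    return "\n".join(
        line.replace("\t", "  ", 1) for line in phylip_text.split("\n")
    )


fixed = tabs_to_spaces("2 6\nseqA\tATGAAA\nseqB\tATGAAG\n")
print(fixed)
```

Writing the result to a new file instead of overwriting the original keeps the pal2nal output available for comparison.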

2.6 years ago

Hi, have you figured out this problem? I ran into the same issue. Could you please give me some suggestions?