I have an EMBL file ready for ENA submission, but validation fails on the translation table. It should be 5, and I have added that to the file, but the validator seems to expect translation table 1. I can't see what else I could add to make it pick this up. I have replaced identifying details with XXX so the species is not disclosed. The error seems to point at the first line. Any ideas?

ID   XXX; XXX; circular; genomic DNA; XXX; XXX; 15307 BP.
XX
AC   XXX;
XX
AC * _Mitochondria
XX
PR   Project:PRJEB11111;
XX
DT   01-May-2020 (Rel. 133, Created)
XX
DE   XXX
XX
KW   .
XX
OS   XXX
XX
RN   [1]
RP   1-15307
RG   XXX
RT   ;
RL   Submitted (01-MAY-2020) to the INSDC.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..15307
FT                   /mol_type="genomic DNA"
FT                   /organelle="mitochondrion"
FT                   /organism="XXX"
FT   gene            complement(12989..13705)
FT                   /locus_tag="cox2"
FT   mRNA            complement(12989..13705)
FT                   /locus_tag="cox2"
FT   CDS             complement(12989..13705)
FT                   /locus_tag="cox2"
FT                   /transl_table=5


Error:

ERROR: organism classified. Submitted /transl_table "5" conflicts with translation table "1" recruited from taxonomy. Please check submitted /transl_table, /organelle and /organism for agreement. Contact us if necessary. [ line: 1 of MT3.embl.gz]

Did you do what the error message asks: "Please check submitted /transl_table, /organelle and /organism for agreement"?

What does that mean specifically, though? I have done the checks below.

The translation table is correct for that species; I checked on NCBI to confirm. The organelle is "mitochondrion" and the species is correct, so yes. The validator seems not to pick up that this is a mitochondrial sequence. The NCBI Taxonomy entry reads: NCBI BLAST name: moths; Rank: species; Genetic code: Translation table 1 (Standard); Mitochondrial genetic code: Translation table 5 (Invertebrate Mitochondrial).

See translation table codes here. Are they matching what you submitted?

I don't understand... that link shows what I have

I have an annotated insect mitochondrial sequence. The translation table listed there is the one below, and it is the same as what I have: 5. The Invertebrate Mitochondrial Code (transl_table=5)

yes they match

It would be good if there were a working example of a flat file, so I could see what the header looks like for a mitochondrion, but I can't find one. I contacted ENA, but I had already tried their suggestion and it did not work, so I sent it back to them. It doesn't appear obvious to them either, which suggests the documentation is not good enough.

you could go to ENA and query a mitochondrion sequence?

perhaps adding an 'OG' line underneath the OS one might help?

OG Mitochondrion seems to be the critical line.

I tried that before and just gave it another go, and no, adding that gives the same error.

tried everything in here to match

are you running this through the ENA java tool validator?

yes latest version of validator

That's what gives the error: ERROR: organism classified. Submitted /transl_table "5" conflicts with translation table "1" recruited from taxonomy. Please check submitted /transl_table, /organelle and /organism for agreement. Contact us if necessary. [ line: 1 of MT3.embl.gz]

I could just take out the correct translation table and let it go through as 1, but that would mean incorrect translations.

FT                   /organelle="mitochondrion"


is not correctly formatted. Can you check that the number of spaces is correct and that there are no weird/hidden characters on that line?

If that line is correctly present in the EMBL record, the validator only gives a warning/info about the transl_table; if the line is wrong or missing, it does output an error on the /transl_table.

I ran dos2unix on the manifest file and the EMBL file, but no change. I also checked the spaces. So irritating.

dos2unix might not fix all "weird" chars. You could open the file in vi and then run :set list, which will show all hidden chars.
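
If vi is not convenient, a short script can do a similar scan. This is only a rough sketch, not part of the ENA tooling: the function names are made up for illustration, and it assumes the flat file should contain nothing but printable ASCII, so tabs, carriage returns and anything non-ASCII are reported.

```python
import gzip
import unicodedata


def find_hidden_chars(text):
    """Return (line_number, column, description) for each suspicious character.

    Assumption: an EMBL flat file should only contain printable ASCII,
    so tabs, carriage returns and non-ASCII characters are all flagged.
    """
    findings = []
    # Split on "\n" only, so a stray "\r" stays visible inside the line.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == "\r":
                findings.append((line_no, col, "carriage return"))
            elif ch == "\t":
                findings.append((line_no, col, "tab"))
            elif ord(ch) < 32 or ord(ch) > 126:
                name = unicodedata.name(ch, "unnamed")
                findings.append((line_no, col, f"U+{ord(ch):04X} {name}"))
    return findings


def scan_file(path):
    """Scan a plain or gzipped flat file; newline='' keeps '\\r' intact."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8", errors="replace", newline="") as handle:
        return find_hidden_chars(handle.read())
```

Calling `scan_file("MT3.embl.gz")` would return an empty list for a clean file; anything else gives the line and column to look at.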

how did you generate this file?

The problem is probably related to your chromosome list file. Please check it. See here for information: https://ena-docs.readthedocs.io/en/latest/submit/fileprep/assembly.html#chromosome-list-file
Could you show what you have in that file?

Mt MIT chromosome Mitochondrion

I think it is inconsistent. In the flat file the sequence is called Mitochondria (cf. the AC line; the name is extracted from the FASTA file by EMBLmyGFF3), while in your chromosome list file it is called Mt. You should replace Mt with Mitochondria, then try to validate again.
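
The name comparison described here can be automated. The sketch below rests on two assumptions taken from this thread rather than from the ENA specification: that the entry name is the token after `AC *` with any leading underscore dropped, and that the first whitespace-separated column of the chromosome list is the object name. The function names are hypothetical.

```python
def entry_names_from_flatfile(text):
    """Collect entry names from 'AC * _name' lines (leading '_' dropped)."""
    names = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "AC" and parts[1] == "*":
            names.append(parts[2].lstrip("_"))
    return names


def object_names_from_chromosome_list(text):
    """Take the first column of every non-empty chromosome list line."""
    return [line.split()[0] for line in text.splitlines() if line.strip()]


def missing_from_chromosome_list(flatfile_text, chromosome_list_text):
    """Return flat-file entry names that the chromosome list does not mention."""
    listed = set(object_names_from_chromosome_list(chromosome_list_text))
    return [name for name in entry_names_from_flatfile(flatfile_text)
            if name not in listed]
```

With the files shown in this thread, the function would report `Mitochondria` as missing, because the chromosome list names the object `Mt`.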

I tried that but still no joy. I'll have to go through it line by line and check all the details again; maybe I'm missing something obvious. It would be good if these tools gave an actual human-readable error message.

The new webin-cli 2.2.3 requires a manifest file and a chromosome list that are formatted according to the below guidelines:

Manifest file guidance: https://ena-docs.readthedocs.io/en/latest/submit/assembly/genome.html#manifest-files

Chromosome list file guidance: https://ena-docs.readthedocs.io/en/latest/submit/fileprep/assembly.html#chromosome-list-file

That might be it: I have not listed the chromosome list file in the manifest. I'll give that a go.

Yes, that solved it. I still got the abutting gene error that they say they fixed in 2.2.4, so I'll contact them to check that it has been fixed.
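
Since the fix came down to the manifest not listing the chromosome list file, a quick pre-check before running the validator may save some time. This is a minimal sketch, assuming the manifest has one `FIELD value` pair per line separated by whitespace, and that `FLATFILE` and `CHROMOSOME_LIST` are the fields that matter for a case like this one. The helper names and the chromosome list filename in the comment are hypothetical; see the manifest guidance linked above for the authoritative field list.

```python
def manifest_fields(text):
    """Parse 'FIELD value' manifest lines into a dict keyed by upper-case field."""
    fields = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if not parts:
            continue  # skip blank lines
        value = parts[1].strip() if len(parts) > 1 else ""
        fields[parts[0].upper()] = value
    return fields


def missing_fields(text, required=("FLATFILE", "CHROMOSOME_LIST")):
    """Return the required fields that are absent or have an empty value."""
    fields = manifest_fields(text)
    return [name for name in required if not fields.get(name)]


# Example (the chromosome list filename here is hypothetical):
#   missing_fields("FLATFILE MT3.embl.gz\nCHROMOSOME_LIST chromosome_list.txt.gz\n")
```

An empty result means both files are referenced; it says nothing about whether their contents agree with each other.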

