I am validating a flatfile of an annotated chloroplast genome scaffold. It includes /organism="Cannabis sativa" and /organelle="plastid:chloroplast". This means the CDS should be translated according to the bacterial translation table, and I therefore included the /transl_table=11 qualifier in my CDS annotations. See the head of the flatfile below (some info anonymized as XX):
ID XX_complete; SV 1; circular; unassigned DNA; STD; UNC; 153893 BP.
XX
AC XX_complete;
XX
DE Cannabis sativa sample XXX chloroplast, complete genome.
XX
KW .
XX
OS chloroplast XX_com
XX
FH Key Location/Qualifiers
FH
FT source 1..153893
FT /organism="Cannabis sativa"
FT /organelle="plastid:chloroplast"
FT /mol_type="genomic DNA"
FT misc_feature 1..84054
FT /note="large single copy (LSC)"
FT tRNA complement(2..75)
FT /gene="trnH-GUG"
FT /product="tRNA-His"
FT gene complement(2..75)
FT /gene="trnH-GUG"
FT CDS complement(398..1459)
FT /gene="psbA"
FT /product="photosystem II protein D1"
FT /translation="MTAILERRESESLWGRFCNWITSTENRLYIGWFGVLMIPTLLTAT
FT SVFIIAFIAAPPVDIDGIREPVSGSLLYGNNIISGAIIPTSAAIGLHFYPIWEAASVDE
FT WLYNGGPYELIVLHFLLGVACYMGREWELSFRLGMRPWIAVAYSAPVAAATAVFLIYPI
FT GQGSFSDGMPLGISGTFNFMIVFQAEHNILMHPFHMLGVAGVFGGSLFSAMHGSLVTSS
FT LIRETTENESANEGYRFGQEEETYNIVAAHGYFGRLIFQYASFNNSRSLHFFLAAWPVV
FT GIWFTALGISTMAFNLNGFNFNQSVVDSQGRVINTWADIINRANLGMEVMHERNAHNFP
FT LDLAALEVPSTNG"
FT /trans_splicing
FT /transl_table=11
FT gene complement(398..1459)
FT /gene="psbA"
As this is a scaffold, the manifest file includes MINGAPLENGTH 5, see below:
STUDY XX
SAMPLE XX
NAME XX_chloroplast
FLATFILE flatfiles/XX.embl.gz
MINGAPLENGTH 5
ASSEMBLY_TYPE isolate
PROGRAM GetOrganelle v1.7.4.1
PLATFORM ILLUMINA
RUN_REF ERR11654556,ERR11654371
COVERAGE 243.3
However, when validating the flatfile, webin_cli (version 6.5.1) throws a series of errors about conflicting translation tables, see a single line below:
ERROR: organism classified. Submitted /transl_table "11" conflicts with translation table "1" recruited from taxonomy. Please check submitted /transl_table, /organelle and /organism for agreement. Contact us if necessary. [ line: 1 of XX.embl.gz]
I believe this is a bug, because the translation table should not be recruited from taxonomy alone ("1") without taking into account the organelle information in the flatfile, which implies table "11".
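To rule out a mismatch inside the flatfile itself, the agreement between /organelle and /transl_table can be checked independently of webin_cli. The following is only a minimal sketch: the function names and the lookup table are hypothetical, and the only mapping it assumes is the one stated above (plastid sequences use table 11). It is not how webin_cli performs its own check.

```python
import re

# Assumption taken from the post: plastid sequences are translated with table 11.
# This mapping is illustrative and deliberately incomplete.
EXPECTED_TABLE_BY_ORGANELLE_PREFIX = {"plastid": 11}


def summarize_flatfile(text):
    """Return (first /organelle value, set of /transl_table values) from FT lines."""
    organelle = None
    tables = set()
    for line in text.splitlines():
        if not line.startswith("FT"):
            continue  # only feature-table lines carry qualifiers
        match = re.search(r'/organelle="([^"]+)"', line)
        if match and organelle is None:
            organelle = match.group(1)
        match = re.search(r"/transl_table=(\d+)", line)
        if match:
            tables.add(int(match.group(1)))
    return organelle, tables


def tables_agree_with_organelle(organelle, tables):
    """True/False if the organelle is in the lookup table, None if it is unknown."""
    if organelle is None:
        return None
    prefix = organelle.split(":")[0]  # "plastid:chloroplast" -> "plastid"
    expected = EXPECTED_TABLE_BY_ORGANELLE_PREFIX.get(prefix)
    if expected is None:
        return None
    # False also covers the case where no /transl_table qualifier is present.
    return tables == {expected}
```

To use it, read the decompressed flatfile into a string (for example with gzip.open(path, "rt").read()) and pass the result of summarize_flatfile to tables_agree_with_organelle. For the flatfile shown above, this check passes, which is consistent with the conflict originating outside the flatfile.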
After some tests I found that:
- The error disappears after removing the /transl_table=11 qualifier. But that would imply an incorrect translation table - we do not want that.
- The error disappears when replacing MINGAPLENGTH 5 with a reference to a chromosome file with Chloroplast as the fourth column. However, this is only possible for chromosome-level assemblies, not scaffolds.
- The error remains even when using an example chloroplast genome accession flatfile from ENA (MN857160.1). It therefore seems the error is not caused by incorrect formatting of the flatfile.
Based on these observations, I suspect that the error is caused by the taxonomy being retrieved from the associated sample accession. For obvious reasons, the sample has a taxon qualifier and not an organelle or chloroplast qualifier, so the organelle information in the flatfile is never considered.
Does anyone have an idea how I might resolve this issue?