High CDS count in my genome assembled from Nanopore (ONT) reads
4.7 years ago
Optimist ▴ 180

Hi all,

I have assembled multiple bacterial genomes sequenced on an Oxford Nanopore MinION (FLO-MIN106 flow cell) sequencer.

I used the Pomoxis and Unicycler assemblers to perform the genome assembly. When I annotated the resulting FASTA files with RAST and PATRIC, the CDS count was abnormally high (double in some cases) compared to existing assemblies.

The CDS ratio ranges from 0.44 to 0.60 (the normal CDS ratio prescribed by NCBI ranges between 0.8 and 1.2).

How can I overcome this abnormal CDS count? What is the way forward?

Thanking you all

4.7 years ago
h.mon 35k

As this is a Nanopore-only assembly, there are many errors (mainly indels) which negatively affect gene prediction:

Nanopore only assembly errors

Mind the gaps – ignoring errors in long read assemblies critically affects protein prediction
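To make the indel problem concrete, here is a toy sketch in Python. The sequence is invented for illustration (it is not from any real genome) and `codons_before_stop` is a hypothetical helper, not part of any gene finder. It only shows how a single-base deletion shifts the reading frame and brings a stop codon into frame early.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def codons_before_stop(seq):
    """Count codons read from position 0 until the first in-frame stop codon.

    Returns None if no in-frame stop codon is found.
    """
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

# Made-up toy "gene": start codon, five sense codons, then a stop codon.
gene = "".join(["ATG", "CCT", "GAC", "CTA", "AGC", "GTT", "TAA"])

# Simulate a single-base deletion (the kind of error common in Nanopore-only
# consensus) by dropping the base at index 3.
mutated = gene[:3] + gene[4:]

print("intact gene :", codons_before_stop(gene), "codons before stop")
print("with 1 indel:", codons_before_stop(mutated), "codons before stop")
```

In the intact toy gene the stop is reached after 6 codons. After the deletion the frame shifts and a stop appears after 3 codons, so a gene finder would see a truncated ORF here.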


If your consensus accuracy is 99.9%, you still have 1 error every 1000 bp. A typical bacterial gene is ~1000 bp long. That 1 error is usually an indel, which results in a frameshift in your CDS. If you use a gene finder like Prodigal (used in Prokka), you will get ~2 predicted CDS for every real CDS. You also need to sequence it with Illumina and polish the Nanopore assembly.
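The back-of-the-envelope numbers above can be checked with a few lines of Python. This is only a rough sketch: it assumes errors are independent and uniformly spread along the genome, which real Nanopore errors are not (they cluster in homopolymers), and the function names are made up for illustration.

```python
def expected_errors(gene_len_bp, accuracy):
    """Expected number of consensus errors in a gene of the given length."""
    return gene_len_bp * (1.0 - accuracy)

def prob_at_least_one_error(gene_len_bp, accuracy):
    """Probability that a gene has one or more errors.

    Assumes each base is wrong independently with probability (1 - accuracy).
    """
    return 1.0 - accuracy ** gene_len_bp

gene_len = 1000  # rough length of a typical bacterial gene, in bp

for acc in (0.999, 0.9999):
    print(
        f"accuracy={acc}: "
        f"expected errors per gene={expected_errors(gene_len, acc):.2f}, "
        f"P(at least one error)={prob_at_least_one_error(gene_len, acc):.2f}"
    )
```

Under these assumptions, at 99.9% accuracy roughly 63% of 1000 bp genes carry at least one error. Not every error is a frameshifting indel, so treat that as an upper bound on broken genes, but it shows why polishing matters for gene prediction.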


One note about this: there's probably already Illumina data out there for your strains of interest. Check out this rather nice program to locate and download SRA or ENA data more quickly:

https://ewels.github.io/sra-explorer/


There is no Illumina data available for the isolates under study.
