Hi all. I'm in the process of uploading a draft bacterial genome assembly to NCBI. NCBI asks for a coverage estimate calculated as (number of bp sequenced / expected genome size) x % of bp placed in the final assembly. Because this is a de novo project, I used the KmerGenie estimate as the expected genome size. The numbers are as follows:

Forward read FASTQ file: 5261180 reads, 1575030702 bases
Reverse read FASTQ file: 5049184 reads, 1511690223 bases
Total bp sequenced: 1575030702 + 1511690223 = 3086720925
KmerGenie genome size estimate: 4727586
Actual assembly size: 4706279
This gave a coverage calculation of: (3086720925 / 4727586) x ((4706279 / 3086720925) x 100) = 99.549304867
I am inexperienced, but this seems like high coverage. Does this calculation seem sensible?
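
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation exactly as written above. The variable names are made up for illustration, and the figures are copied from this post. One thing the sketch makes visible: because the total number of bases appears once as a numerator and once as a denominator, it cancels, so the final figure is simply the assembly size as a percentage of the estimated genome size. The raw read depth (total bases / expected genome size) is printed separately for comparison.

```python
# Figures copied from the post; variable names are illustrative only.
forward_bases = 1_575_030_702
reverse_bases = 1_511_690_223
expected_genome_size = 4_727_586   # KmerGenie estimate
assembly_size = 4_706_279          # size of the final assembly

total_bases = forward_bases + reverse_bases

# First factor: total bases sequenced divided by the expected genome size.
raw_depth = total_bases / expected_genome_size

# Second factor as written in the post: assembly size over total bases, as a percentage.
second_factor = assembly_size / total_bases * 100

# Product of the two factors, i.e. the figure reported in the post.
result = raw_depth * second_factor

# total_bases cancels in that product, leaving this simpler ratio.
ratio_percent = assembly_size / expected_genome_size * 100

print(f"total bases sequenced: {total_bases}")
print(f"raw depth (first factor): {raw_depth:.1f}x")
print(f"result as written in the post: {result:.6f}")
print(f"assembly size / estimated size x 100: {ratio_percent:.6f}")
```

The last two printed values agree, which shows that the 99.55 figure is a percentage of the estimated genome size rather than a depth in "x"; the first factor on its own is the number that scales with how much was sequenced.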