I have sequence data from grapevines and I'm trying to see which viruses are present. I was able to assemble most of a viral genome (15,000 out of 18,000 bp in one contig), and now I'm trying to estimate the coverage. Here's a breakdown of what I did:
raw reads -> trim adapters -> map to grape reference and remove mapped reads -> assemble trimmed, unmapped reads.
To estimate the coverage, I took the virus with the largest contig (grapevine leafroll-associated virus 3) and mapped the trimmed reads that did not map to grape against it. I started with about 7 million reads, and 1.3 million of them mapped to the viral genome. The average read length was 50 bp and the total genome size is 18 kbp. Using the equation presented in this Illumina technote: http://res.illumina.com/documents/products/technotes/technote_coverage_calculation.pdf I get
Coverage = Length of read * number of reads / haploid genome length
Coverage = 50 * 1.3x10^6 / 1.8x10^4
Coverage = 3611x? Could that be right?
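
To double-check the arithmetic, here is the same calculation as a small Python sketch. The function name mean_coverage is just my own label (it is not from the technote), and it assumes every mapped read contributes its full 50 bp, so it only gives an average depth over the whole genome, not the depth at any particular position.

```python
def mean_coverage(read_length_bp, mapped_reads, genome_length_bp):
    """Average depth: total mapped bases divided by genome length."""
    return read_length_bp * mapped_reads / genome_length_bp


# Numbers from the post: 50 bp reads, 1.3 million mapped reads, 18 kbp genome.
coverage = mean_coverage(50, 1_300_000, 18_000)
print(f"Estimated mean coverage: {coverage:.0f}x")
```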