I have human WGS data in FASTQ format from a HiSeq 2000.
From raw-sequence QC with FastQC, I found that Total Sequences (in fastqc_data.txt) is about 436 Mb.
After alignment with BWA-MEM, Picard's CollectWgsMetrics reports a genome territory of about 2.86 Gb and a mean coverage of about 31X.
Genome territory is the number of non-N bases in the reference genome over which coverage is evaluated, and mean coverage is the mean coverage in bases of the genome territory after all filters are applied.
I can't understand the relationship between Total Sequences (FastQC) and mean coverage (Picard). How can 436 Mb of sequence cover a 2.86 Gb genome at 31X? Please explain how these two numbers relate.
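For reference, here is a rough sketch of the arithmetic I am trying to reconcile. It assumes that Total Sequences is a count of reads rather than bases, and the read length of 100 bp and the paired-end layout (two FASTQ files with the same number of reads each) are my assumptions, not values taken from the reports. The function name `expected_coverage` is just illustrative.

```python
def expected_coverage(n_reads, read_length, genome_size, paired=False):
    """Raw expected coverage = total sequenced bases / genome size."""
    # For paired-end data, FastQC is run per file, so the read count doubles.
    total_reads = n_reads * (2 if paired else 1)
    return total_reads * read_length / genome_size

total_sequences = 436_000_000     # FastQC "Total Sequences", treated as reads (assumption)
read_length = 100                 # hypothetical read length in bp (assumption)
genome_territory = 2_860_000_000  # Picard genome territory, about 2.86 Gb

single = expected_coverage(total_sequences, read_length, genome_territory)
paired = expected_coverage(total_sequences, read_length, genome_territory, paired=True)

print(f"single-end estimate: {single:.1f}X")  # 15.2X
print(f"paired-end estimate: {paired:.1f}X")  # 30.5X
```

This is only a raw estimate: Picard's mean coverage is computed after its filters are applied, so it would normally come out no higher than the figure from the actual total of sequenced bases.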
Also, is a throughput of hundreds of megabases normal for WGS?
Any comments are welcome.