Genome assembly usually calls for high-coverage data, for example more than 50X for PacBio. In my opinion, 10X should be enough to correct the high-error reads and assemble the genome. But since we can't get uniform coverage across the genome, we increase the total coverage so that most regions are covered by at least 10 reads and can be corrected. Likewise, for Illumina and Sanger data, high coverage ensures that most regions are covered. Am I right?
If the above is right, do we really need 100X? How do we know the minimal dataset we need? Is there any protocol for obtaining uniformly represented genomic DNA from blood or tissue, so that such high coverage can be avoided?
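To make my reasoning concrete, here is a rough sketch of how I would estimate it. It assumes reads land uniformly at random, so per-base depth follows a Poisson distribution with the mean coverage as its parameter. That is an idealization: real data has GC bias and repeats, so real requirements are probably higher. The function names and the 99% target are my own choices for illustration, not taken from any assembler.

```python
import math


def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean), with mean > 0.

    Each term is computed in log space so large means do not underflow.
    """
    return sum(
        math.exp(i * math.log(mean) - mean - math.lgamma(i + 1))
        for i in range(k + 1)
    )


def fraction_covered(mean_cov, min_depth=10):
    """Expected fraction of positions with depth >= min_depth (Poisson assumption)."""
    if min_depth <= 0:
        return 1.0
    return 1.0 - poisson_cdf(min_depth - 1, mean_cov)


def minimal_mean_coverage(min_depth=10, target=0.99, max_cov=200):
    """Smallest integer mean coverage whose covered fraction reaches target.

    Returns None if no value up to max_cov is sufficient.
    """
    for cov in range(1, max_cov + 1):
        if fraction_covered(cov, min_depth) >= target:
            return cov
    return None


if __name__ == "__main__":
    for cov in (10, 20, 30, 50, 100):
        print(f"{cov:>4}X mean -> {fraction_covered(cov):.6f} of bases at >= 10 reads")
    print("minimal mean coverage for 99% of bases at >= 10 reads:",
          minimal_mean_coverage())
```

Under this model, a mean of exactly 10X leaves only about half of the positions with 10 or more reads, which is why the mean has to be well above the per-base depth we actually want. Whatever number the idealized model gives is a lower bound, and the gap up to 50X or 100X would then come from non-uniform coverage.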
Any suggestions would be appreciated! Best wishes!