Dear community, we have been asked to project the read count for our samples. We have to choose between:
- 1 flow cell
- 1 lane
- more than one flow cell
- between 1 flow cell and 1 lane
We have 6 mouse lines, 10 individuals in each one. So we have 60 samples.
I am new to this, but I read up and found this technical note: http://www.illumina.com/Documents//products/technotes/technote_coverage_calculation.pdf where they use the Lander/Waterman equation: C = LN/G (coverage = read length * number of reads / length of genome)
One can solve for N (number of reads): N = CG/L. The length of the genome (2.8G) and the read length (125) can be found, but I am not sure whether the coverage is a constant for the machine (HiSeq2500).
Also, if the reads are paired-end (2x125), do we have to multiply by 2, i.e. 2*(CG/L)? And how could one factor in multiplexing?
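For illustration, the arithmetic of N = CG/L can be sketched in Python. This is only a rough sketch under stated assumptions: the 30x target coverage is an assumed value chosen for the example, the function name is hypothetical, and the multiplexing step simply assumes all 60 samples are sequenced to the same depth. Since N counts individual reads, a 2x125 paired-end fragment contributes two of them, so the sketch halves N to get read pairs rather than doubling it.

```python
import math

def reads_required(coverage, genome_size, read_length):
    """Solve C = L*N/G for N, rounded up to a whole read (hypothetical helper)."""
    return math.ceil(coverage * genome_size / read_length)

GENOME_SIZE = 2_800_000_000  # 2.8G, from the post
READ_LENGTH = 125            # from the post
COVERAGE = 30                # ASSUMPTION: target depth picked for this example
N_SAMPLES = 60               # 6 lines x 10 individuals, from the post

reads_per_sample = reads_required(COVERAGE, GENOME_SIZE, READ_LENGTH)

# Paired-end: one fragment yields 2 reads, so fragments (pairs) = N / 2.
pairs_per_sample = math.ceil(reads_per_sample / 2)

# ASSUMPTION: equal depth per sample when multiplexing.
total_reads = reads_per_sample * N_SAMPLES

print(f"reads per sample: {reads_per_sample:,}")
print(f"read pairs per sample: {pairs_per_sample:,}")
print(f"reads for all {N_SAMPLES} samples: {total_reads:,}")
```

With these inputs the sketch gives 672,000,000 reads (336,000,000 pairs) per sample; a different target coverage scales the result linearly.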
There I get (for 30x coverage): Output Required: 85,714,285,714 bases
But the number of reads required is not shown. Should I just divide Output Required by the read length?
The quality of your libraries will determine yield, measured as clusters passing filter (in addition to run-time non-biological variation). Each cluster passing filter will yield 2 reads if you are doing paired-end sequencing.
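To make the relationship between output, reads, and clusters concrete, here is a minimal sketch that converts the "Output Required" figure quoted above into reads and then into clusters passing filter. It assumes every read is the full 125 bases and that each passing cluster gives exactly 2 reads (paired-end); the variable names are made up for the example, and real yield will vary with library quality.

```python
import math

OUTPUT_REQUIRED_BASES = 85_714_285_714  # figure quoted in the question
READ_LENGTH = 125                       # bases per read (2x125 run)
READS_PER_CLUSTER = 2                   # paired-end: 2 reads per passing cluster

# Bases divided by read length gives the number of individual reads.
reads_needed = math.ceil(OUTPUT_REQUIRED_BASES / READ_LENGTH)

# Each cluster passing filter contributes 2 reads of 125 bases each.
clusters_needed = math.ceil(
    OUTPUT_REQUIRED_BASES / (READ_LENGTH * READS_PER_CLUSTER)
)

print(f"reads needed: {reads_needed:,}")
print(f"clusters passing filter needed: {clusters_needed:,}")
```

So dividing the required output by the read length gives reads, and dividing again by 2 gives the number of clusters passing filter for a paired-end run.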