How do I down-sample to, say, 13 million bases total?
Quick summary: I'm trying to get even depth of coverage, but the FASTQ files have different read lengths and the references have different sizes.
More details: I have 60 samples, and each FASTQ file has a slightly different read length. Down-sampling by read count therefore gets complicated, because each sample needs a different number of reads. I also have multiple BED files, and I'm trying to make sure each BED file has exactly 13x coverage when I align the reads to the reference.
I could automate this: calculate the average read length of each FASTQ file, do the math, and then dynamically down-sample that file to the appropriate number of reads. However, I'm hoping there is an easier way that lets me down-sample directly to a specific base count.
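
For reference, here is a minimal Python sketch of that manual approach. It assumes uncompressed, single-end FASTQ with exactly four lines per record. The function names (`read_fastq`, `pick_reads`, `downsample_fastq`) and the file names in the usage line are hypothetical. Instead of estimating from the average read length, it sums the actual read lengths and picks reads in random order until the base target is reached, so the result can overshoot the target by up to one read.

```python
import random


def read_fastq(handle):
    """Yield (header, seq, plus, qual) tuples from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a trailing blank line)
            return
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def pick_reads(lengths, target_bases, seed=0):
    """Return the set of read indices to keep.

    Reads are visited in a seeded random order and added until the
    running base total reaches target_bases.
    """
    order = list(range(len(lengths)))
    random.Random(seed).shuffle(order)
    keep, total = set(), 0
    for i in order:
        if total >= target_bases:
            break
        keep.add(i)
        total += lengths[i]
    return keep


def downsample_fastq(in_path, out_path, target_bases, seed=0):
    """Two-pass down-sampling: collect read lengths, then write the chosen reads.

    Returns the number of bases actually written.
    """
    # Pass 1: only the read lengths are held in memory.
    with open(in_path) as fh:
        lengths = [len(rec[1]) for rec in read_fastq(fh)]
    keep = pick_reads(lengths, target_bases, seed)

    # Pass 2: write the selected records in their original file order.
    kept_bases = 0
    with open(in_path) as fh, open(out_path, "w") as out:
        for i, rec in enumerate(read_fastq(fh)):
            if i in keep:
                out.write("\n".join(rec) + "\n")
                kept_bases += len(rec[1])
    return kept_bases
```

It would be called as something like `downsample_fastq("sample.fastq", "sample.13M.fastq", 13_000_000)`. Paired-end data would need the same indices applied to both mate files, which this sketch does not handle.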