6.8 years ago
VicGB
I have a BAM file with very high coverage in some regions, and I want to randomly subsample only the reads that cover those regions, without subsampling reads that cover low-coverage regions.
Is there a tool for this? Thanks!
The reason for this "cherry-picking" (lol?) is that I'm using a genome assembly pipeline that performs 20 random subsamplings, each one bringing the coverage down to 150x, and then builds a consensus to reconstruct the genome. The problem is that in some samples the coverage is huge in some regions but relatively low in others. Random subsampling reduces coverage across all regions, including those already below 150x, and that creates gaps in the reconstructed genome during the consensus step. So it would be better to subsample only the reads covering high-coverage regions and leave the rest untouched.
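To make the idea concrete, here is a minimal sketch of coverage-aware subsampling in plain Python. It is not taken from samtools, Picard, or the pipeline mentioned above. It assumes reads are given as half-open `(start, end)` intervals on a single reference of known length, and the function names and the keep-probability rule (`target / local depth`, capped at 1) are illustrative choices only. With a real BAM you would need a library such as pysam to get the read coordinates.

```python
import random


def depth_per_position(reads, length):
    """Per-base depth for half-open (start, end) read intervals."""
    diff = [0] * (length + 1)
    for start, end in reads:
        diff[start] += 1
        diff[end] -= 1
    depth, running = [], 0
    for delta in diff[:length]:
        running += delta
        depth.append(running)
    return depth


def coverage_aware_subsample(reads, length, target=150, seed=0):
    """Keep each read with probability min(1, target / mean depth under it).

    Reads in regions at or below the target depth are always kept;
    reads in deeper regions are thinned roughly down to the target.
    Assumes every read satisfies 0 <= start < end <= length.
    """
    rng = random.Random(seed)
    depth = depth_per_position(reads, length)
    kept = []
    for start, end in reads:
        mean_depth = sum(depth[start:end]) / (end - start)
        keep_prob = min(1.0, target / mean_depth)
        if rng.random() < keep_prob:
            kept.append((start, end))
    return kept
```

Running this 20 times with different `seed` values would give 20 subsamples in which only the over-covered regions are thinned. Note that paired-end reads would need both mates kept or dropped together, which this sketch does not handle.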
I did not know that samtools could do this; now that I looked it up, there is indeed such a feature. Thought the correct command would be
Picard also has a similar function, DownsampleSam, that you may use: https://broadinstitute.github.io/picard/command-line-overview.html#DownsampleSam