7.1 years ago by
First of all you should pull out your region of interest from the big bam file. That will make your analysis faster and you would be able to calculate coverage specific for that region.
It should be sth like:
samtools view -bh your.sorted.bam chr1:100,000-20,000 (region of interest) > regionofinterest.bam
Now there are many posts on Biostars that tell you how to run pileup command in old samtools to get the number of reads that cover your region of interest. Basically, after running pileup command you will get a file that will contain all the positions in your region of interest and the coverage (or number of reads covering) that position. The positions that will have zero reads or zero coverage will not be present in the pileup file. Now you can write a script that can go through your pileup file and can get you the percentage of your region of interest that has 5 or more reads , 10 or more reads etc. Please read samtools commands and samtools pileup command from old samtools. Enjoy the reading.