1
1
Entering edit mode
2.9 years ago
LimMo ▴ 10

Hello all,

I'm using samtools in order to get the depth of a file called file.bam, but the output I have is like this:

samtools depth file.bam > output.txt

11  88911833    2
11  88911834    2
11  88911835    2


I understand that the first column is the name of the reference sequence, the second column is the base index within the reference, and the third column is the depth of coverage for that base.

But I need at the end the read depth represented like 100X, 150X and something like this.

Is there any calculation should I follow/apply to the output file?

depth samtools • 15k views
0
Entering edit mode

I think you want the average depth of your sequencing data, to do that you can simply take the distribution of individual bases depths and compute the mean.

11
Entering edit mode
2.9 years ago
benformatics ★ 2.6k

You can calculate the average coverage (for covered bases):

samtools depth  *bamfile*  |  awk '{sum+=$3} END { print "Average = ",sum/NR}'  This would be average coverage for all covered regions. If you want to include regions that were not covered in this calculation you need ot use: samtools depth -a To get your average X coverage you would need to divide by the total size of your genome, instead of dividing by NR in the command above. The total size can be calculated like this: samtools view -H *bamfile* | grep -P '^@SQ' | cut -f 3 -d ':' | awk '{sum+=$1} END {print sum}'


Then you would redo the calculation substituting this in for NR.

I stole all of this from: Tools To Calculate Average Coverage For A Bam File?

0
Entering edit mode

Thanks a lot for your help, I will try that :)

0
Entering edit mode

Hello, if I get from this calculation, let's say 1.7554 coverage, accounting for 0 depth reads. And 23 for non 0 reads. The coverage then (to represent as X) should be 1.7554*100=175.54 X?