Question

coverage per base using samtools depth and mpileup

1

Entering edit mode

9.8 years ago

tonja.r ▴ 600

I want to calculate the coverage for multiple bam files over multiple regions. I have ChIP-seq data for different histones, so I would like to take into consideration the positive and negative stranded reads. Also, I have multiple bam files, where the coverage should be determined over specific genomic regions.

I took a look on samtools depth and mpileup.

As far as I understood, depth and mpileup are doing exactly what I need. However, the results differ:

samtools depth -r chr11:56824193-57143746 file.bam | awk -F '\t' '{print $3}' | sort | uniq -c | sort -nr
214 16
184 17
99 18
93 19
59 20
39 22
27 21
15 23
4 24

samtools mpileup -r chr11:56824193-57143746 file.bam | awk -F '\t' '{print $3}' | sort | uniq -c | sort -nr

212 16
212 0
174 17
103 18
90 19
56 20
38 22
27 21
15 23
3 24

Why is there a difference in the counts?

Also, concerning my forward and reverse coverage: prior using samtools depth, I need to filter forward and reverse reads with samtools view:

samtools view -b -F 20 lala.bam

as a result I am getting an unindexed .bam file, so I can't use depth directly.

So, will it be appropriate to use samtools mpileup with --rf 16 to calculate reverse reads and with --ff 20 to calculate positive reads?

sequencing • 10k views

ADD COMMENT • link updated 2.8 years ago by Ram 45k • written 9.8 years ago by tonja.r ▴ 600

1

Entering edit mode

Just find an example of a position where there's a difference and look at that in IGV. It's likely that (A) the Phred threshold is different between the two tools and/or (B) depth isn't doing the overlap detection. It's also possible that (C) depth isn't using a MAPQ threshold (there are settings for that, but the default isn't mentioned).

ADD REPLY • link 9.8 years ago by Devon Ryan 105k

score 1 · Answer 1 · 2015-09-15

1

Entering edit mode

9.8 years ago

tonja.r ▴ 600

Yep. mpileup used minimum base quality for a base to be considered = 13.

ADD COMMENT • link 9.8 years ago by tonja.r ▴ 600

1

Entering edit mode

Glad it was an easy resolution!

ADD REPLY • link 9.8 years ago by Devon Ryan 105k

Ram · Answer 2 · 2015-09-15

1

Entering edit mode

9.8 years ago

Shicheng Guo ★ 9.6k

Actually, the most direct way is use bedtools coverage, there are hundreds options in bedcoverage, you can find all the aim you want to do.

ADD COMMENT • link updated 2.8 years ago by Ram 45k • written 9.8 years ago by Shicheng Guo ★ 9.6k

0

Entering edit mode

But isn't it slower? I tried coverageBed -a gene_coord_red.bed -b reads.sort.bam -d with only one region and it took much longer than mpileup with the same region.

ADD REPLY • link updated 2.8 years ago by Ram 45k • written 9.8 years ago by tonja.r ▴ 600