Question: coverage per base using samtools depth and mpileup
1
gravatar for tonja.r
3.6 years ago by
tonja.r450
UK
tonja.r450 wrote:

 

I want to calculate the coverage for multiple bam files over multiple regions. I have ChIP-seq data for different histones, so I would like to take into consideration the positive and negative stranded reads. Also, I have multiple bam files, where the coverage should be determined over specific genomic regions.


I took a look on samtools depth and mpileup.
As far as I understood, depth and mpileup are doing exactly what I need. However, the results differ:
 

samtools depth -r chr11:56824193-57143746 file.bam | awk -F '\t' '{print $3}' | sort | uniq -c | sort -nr
214 16
184 17
99 18
93 19
59 20
39 22
27 21
15 23
4 24

samtools mpileup -r chr11:56824193-57143746 file.bam | awk -F '\t' '{print $3}' | sort | uniq -c | sort -nr

212 16
212 0
174 17
103 18
90 19
56 20
38 22
27 21
15 23
3 24

Why is there a difference in the counts?

Also, concerning my forward and reverse coverage: prior using samtools depth, I need to filter forward and reverse reads with samtools view:
samtools view -b -F 20 lala.bam 

as a result I am getting an unindexed .bam file, so I can't use depth directly.

So, will it be appropriate to use samtools mpileup with --rf 16 to calculate reverse reads and with --ff 20 to calculate positive reads?

 

sequencing • 3.8k views
ADD COMMENTlink modified 3.6 years ago by Shicheng Guo7.4k • written 3.6 years ago by tonja.r450
1

Just find an example of a position where there's a difference and look at that in IGV. It's likely that (A) the Phred threshold is different between the two tools and/or (B) depth isn't doing the overlap detection. It's also possible that (C) depth isn't using a MAPQ threshold (there are settings for that, but the default isn't mentioned).

ADD REPLYlink written 3.6 years ago by Devon Ryan89k
1
gravatar for tonja.r
3.6 years ago by
tonja.r450
UK
tonja.r450 wrote:

Yep. mpileup used minimum base quality for a base to be considered = 13.

ADD COMMENTlink written 3.6 years ago by tonja.r450
1

Glad it was an easy resolution!

ADD REPLYlink written 3.6 years ago by Devon Ryan89k
1
gravatar for Shicheng Guo
3.6 years ago by
Shicheng Guo7.4k
Shicheng Guo7.4k wrote:

Actually, the most direct way is use 'bedtools coverage', there are hundreds options in bedcoverage, you can find all the aim you want to do. 

ADD COMMENTlink written 3.6 years ago by Shicheng Guo7.4k

but isn't it slower? I tried coverageBed -a gene_coord_red.bed -b reads.sort.bam -d with only one region and it took much longer than mpileup with the same region.

ADD REPLYlink written 3.6 years ago by tonja.r450
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 672 users visited in the last hour