Question: BEDtools coverage output
1
gravatar for novice
4.9 years ago by
novice990
United States
novice990 wrote:

I'm trying to produce a simple window => coverage table from a bam file. I produced (non-sliding) windows with bedtools makewindows and then ran bedtools coverage -abam alignment.bam -b reference.windows > coverage. The problem is that I cannot make sense of the output I got. Here's a sample:

    I    0    215    SRR1297046.176411/2    23    +    0    215    0,0,0    1    215,    0,    1    215    215    1.0000000

    I    0    216    SRR1297046.2210768/1    23    +    0    216    0,0,0    1    216,    0,    1    216    216    1.0000000

    I    1    222    SRR1297046.178368/1    24    +    1    222    0,0,0    1    221,    0,    1    221    221    1.0000000

The first field is the chromosome name and the fourth one is the read's, but what are the others? This output is inconsistent with the ones described here http://bedtools.readthedocs.org/en/latest/content/tools/coverage.html?highlight=coverage. How can I convert this output into a "window => coverage" table? 

 

I'd appreciate any help.

coverage alignment bedtools • 7.8k views
ADD COMMENTlink modified 4.9 years ago by Devon Ryan97k • written 4.9 years ago by novice990
2
gravatar for Devon Ryan
4.9 years ago by
Devon Ryan97k
Freiburg, Germany
Devon Ryan97k wrote:

If you want the coverage of each entry in reference.windows then bedtools coverage -a reference.windows -b alignment.bam > coverage is what you want.

BTW, this will probably double count overlapping paired-end reads. You could alternatively use samtools depth with some post-processing and not have to worry about that.

ADD COMMENTlink modified 4.9 years ago • written 4.9 years ago by Devon Ryan97k

Thank you Denvon! That worked. I tried samtools depth -b reference.windows alignment.bam but it gave me the coverage at each position, not windows. Did you mean that it'd be more accurate if I post-processed this output to get what I want?

I also tried samtools bedcov reference.windows alignment.bam, which gave me the coverage at each window. However, the coverage values were way bigger (in hundreds of thousands) than those output by bedtools (in thousands). Do you know why that is?

ADD REPLYlink modified 11 months ago by RamRS30k • written 4.9 years ago by novice990
1

Ah, the "bedcov" command is new. It reports the sum of the per-base coverage, so just divide that by the window size to get the average coverage.

ADD REPLYlink written 4.9 years ago by Devon Ryan97k
1

I think both samtools depth and bedcov will double count overlapping read pairs, unless the overlapping part has been clipped in one of the two reads (I use clipOverlap for that). However, samtools mpileup should do the right thing unless the option --ignore-overlaps (disable read-pair overlap detection) is set.

But if you want to count the fragments per window (as in a typical RNA-Seq or ChIP-Seq experiments) than you could use bedtools coverage using only read 1 or better merge read pairs to a single fragment.

ADD REPLYlink written 4.9 years ago by dariober11k
1

Indeed, setting -Q 1 with bedcov (and -q 1 with depth) would presumably get around this, though obviously one is then filtering out phred 0 bases as well.

ADD REPLYlink modified 11 months ago by RamRS30k • written 4.9 years ago by Devon Ryan97k
1

Mmm... -q 1 will filter for base quality but it will not affect the overlapping issue.

ADD REPLYlink written 4.9 years ago by dariober11k
1

Overlaps are removed in mpileup internally by setting the phred scores to 0. It's a nasty hack, but that's how it's done.

ADD REPLYlink written 4.9 years ago by Devon Ryan97k

For the record, using samtools mpileup doesn't yield the same result as using samtools bedcov -q 1 or depth -Q 1. At least according to my results. I post-processed the outputs from all of the commands and found that mpileup gives a lower coverage than both depth -Q 1 and bedcov -q 1 (which give the same coverage).

Here's the script I used to get the average coverage per window from mpileup's output. Takes the window size (non-sliding) as an argument. Can be modified easily to work with depth.

Example usage:

samtools mpileup alignment.bam | perl windows.pl 10000 > coverage

I hope someone would find it useful:

#! /usr/bin/perl

my $window = shift @ARGV
  or die "Specify a window size!";

while (<>) {
  my $pos = (split)[1];

  $cov += (split)[3]; #change this to 2 for samtools depth output

  unless ($pos % $window) {
    printf "%s\t%s-%s\t%s\n", (split)[0], ($pos - $window), $pos, ($cov / $window);
    $cov = 0;
  }
}
ADD REPLYlink modified 11 months ago by RamRS30k • written 4.9 years ago by novice990

Mpileup performs significantly more filtering, which is why you get discordant results.

ADD REPLYlink written 4.9 years ago by Devon Ryan97k
Hi Denvon, Do you know what kind of filtering mpileup does? I'm not sure which of these options I should rely on. BTW, bedtools coverage gives slightly less average coverage per window than mpileup.
ADD REPLYlink written 4.9 years ago by novice990

See the defaults for -A, -d and -Q (as well as -x).

ADD REPLYlink modified 11 months ago by RamRS30k • written 4.9 years ago by Devon Ryan97k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1906 users visited in the last hour