Excluding reads for Coverage Calculation in given regions
0
0
Entering edit mode
6.3 years ago

Hello,

I have a bam file and I have a bed file with my region of interests. Now I'm interested in the mean coverage for each of these region. That's no problem with picard's CollectHsMetrics and the PER_TARGET_COVERAGE flag.

But now I would like to exclude those reads in coverage calculation that doesn't overlap my target region with a given fraction or at least given bases.

I feel that bedtools might help me here. But I didn't figure out until now how.

Can someone help me?

fin swimmer

bam bedtools coverage • 1.9k views
ADD COMMENT
1
Entering edit mode

Hi , I think you are looking for bedtools intersect , which go a clear documentation here http://bedtools.readthedocs.io/en/latest/content/tools/intersect.html :) You will just need 2 bed, your region of interest and your bed you got in you first step calculation :) Best

ADD REPLY
0
Entering edit mode

Hello, thanks for your answer. This is what I do now:

  1. Remove all reads from the bam file that doesn't overlap with at least the given fraction

    bedtools intersect -a full.bam -b regions.bed -f 0.75 -wa > restricted.bam

  2. Use the resulted bam file for coverage calculation with my tool if choice.

That works like it should, but I need to iterate twice over the alignment file. So I'm also interessed in faster ways.

fin swimmer

ADD REPLY
0
Entering edit mode

Why don't you calculate your coverage in your bam with you final bed (i mean region of interest ) using bedtools ?

ADD REPLY
0
Entering edit mode

bedtools needs to much memory for calculate the coverage and I'm very low at it. Also I miss the possibilty to ignore reads/bases below a given quality treshold, like in the picard tools. And I'm more satisfied with the output of picard tools.

How does bedtools handle overlapping paired reads in coverage calculation? Are they counted as 1 or 2?

fin swimmer

ADD REPLY
0
Entering edit mode

There is an option for bedtools genomecov (http://bedtools.readthedocs.io/en/latest/content/tools/genomecov.html) named :

-pc Calculates coverage for paired-end reads, coverage is calculated as the number of fragments covering each base pair .

To got better performances you may be can split your bed in chromosome ...

ADD REPLY

Login before adding your answer.

Traffic: 1720 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6