Extract genomic coverage of specific regions from bam file
1
0
Entering edit mode
5.2 years ago

I have a bed file with regions of interest and a bam file from STAR - aligned RNA-Seq sample. I want to create a small bash script that would result in a txt file that reports position-wise coverage in these regions.

I have been playing with bedtools: I intersected the bed and bam and then I put the result into bedtools genomecov. The problem is that this tool creates a coverage for whole genome, so even though the bam has been intersected with the regions in bed and I end up with a file that has position-wise coverage it is 95% zeros, obviously... I think if I use bedtools coverage this might work but then I noticed a problem described here: https://github.com/arq5x/bedtools2/issues/534
In order to be consistent with IGV I would like to exclude the gaps from alignments.
Any ideas how to tackle this in another way?

RNA-Seq alignment • 5.2k views
ADD COMMENT
0
Entering edit mode

My favorite in getting coverage stats is mosdepth.

$ mosdepth --by capture.bed sample-output sample.bam

This will create per-base and per-region coverage stats for the regions given in the bed file.

ADD REPLY
0
Entering edit mode

Thanks! I have just checked it out - unfortunately the sample-output.per-base.bed.gz, again, contains information for the whole genome whereas I just want the per-base coverage in bed files... But this is a bed as well... maybe I can use bedtools intersect to intersect it with the original bed with my regions?

ADD REPLY
0
Entering edit mode

You could you please show some lines of your bed file?

ADD REPLY
0
Entering edit mode

It seems that my solution worked as I wanted it. Just FYI: The mosdepth per-base output head:

1   0   10001   0
1   10001   10010   3
1   10010   10107   5
1   10107   10113   3
1   10113   10119   2

One of the regions of interest from bed files

5   134374371   134390285   UBE2B   0   +
ADD REPLY
1
Entering edit mode
5.2 years ago

The solution is to use mosdepth to create position-wise coverages (the output is in bed) and then intersect them with the regions in bed with bedtools intersect.

ADD COMMENT

Login before adding your answer.

Traffic: 1441 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6