How to generate BedGraph from BedTools Coverage?
3
2
Entering edit mode
6.2 years ago
SmallChess ▴ 560

The tool is at:

I want to get base coverage for specific regions (I'll provide a bed file). However, the output is not in a standard BedGraph format that I can export into R for visualisation.

Q: How do I draw density plot of coverage over the regions from the outputs generated by the coverage tool?

Q: Alternatively, how to convert the non-standard outputs to a proper BedGraph? For example, the Sushi R package takes a bedgraph input file.

bedtools bed genomic • 11k views
6
Entering edit mode
6.1 years ago

You should also consider bedtools genomecov, with the -bg option (bg being bedgraph)

Edit: After reading comments, and after some experimenting, what I think you want to do is:

bedtools intersect -a myBam.bam -b myRegions.bed > intersected.bam

bedtools genomecov -trackline -bg -ibam intersected.bam > intersected.bedgraph

Bedtools intersect will give only bam reads that overlap your regions (in myRegions.bed)

Then genomecov will draw a bedgraph file (coverage) only for the read depth for your filtered regions.

0
Entering edit mode

The problem is that this tool doesn't take a BED file filtering the regions that I'm interested. Therefore, it'd be too time consuming to calculate for all bases in the genome.

0
Entering edit mode

Fair enough, I didn't catch the subtlety in your question, but James Ashmore did.

1
Entering edit mode

I think this might be what you want to do - genomecov and then pull out the regions you want using bedtools intersect. But kind of hard to tell.

0
Entering edit mode

Thanks. I think your approach is the best.

0
Entering edit mode

Note: -abam is correct.

1
Entering edit mode
6.1 years ago

Get your output into a standard five- or more columned BED file and then use awk:

$awk '{ print$1"\t"$2"\t"$3"\t"\$5 }' foo.bed > foo.bedgraph
0
Entering edit mode

@ Alex: How can I convert my .sam file to standard .bed file? I have tried sam2bed < input.sam > output.bed it gives me output file but the column 5 always contains the same value (which is 99). According to UCSC bed format this is score column and to me this look strange that for every region the score is same. Then by using your mentioned command I obtain the .bedgraph file which I later use for ChromImpute. Problem is that when ChromeImpute convert these raw signal into certain sized bin signal (25 bp) then for every region I get the 0 signal.. I have tried many ways (using Pyisco, make wig files, genomeCoverageBed etc..) but I am unable to solve this problem. Can you help me with this one? Thanks.

0
Entering edit mode
6.1 years ago
James Ashmore ★ 3.2k

To generate bedGraph output from Bedtools coverage command you need to specify the -counts flag, for example:

bedtools coverage -a sample.bed -b sample.bam -counts > sample.bedgraph

Strictly speaking each row of a bedGraph file contains the chromosome name, the start position, the end position and then some value, which in your case would be the coverage, therefore you will want to keep only these fields, so if your had a bed file with six columns, you would want to keep only the first 3 columns (chrom, start, end) and the last column (coverage):

bedtools coverage -a sample.bed -b sample.bam -counts | cut -f1,2,3,7 > sample.bedgraph
2
Entering edit mode

It's not really clear to me from the original question that this is what they want. This would give one value for each genome region. I suspect they want "wiggly" values for each genome region. But could be wrong.