Hello,
I am not sure if this has been answered before as I looked and couldn't find a simple answer.
I have a BAM file, and all I want is to annotate all regions with 0 coverage in BED format. Is that possible?
Thank you,
Adrian
You want the "genomecov" tool.
Example:
$ bedtools genomecov -ibam aln.bam -bga | head
chr1  0       554304  0
chr1  554304  554309  5
chr1  554309  554313  6
chr1  554313  554314  1
chr1  554314  554315  0
chr1  554315  554316  6
chr1  554316  554317  5
chr1  554317  554318  1
chr1  554318  554319  2
chr1  554319  554321  6
To extract only the intervals with no coverage, just use awk to filter the results:
$ bedtools genomecov -ibam aln.bam -bga | awk '$4==0' | head -n 2
chr1  0       554304  0
chr1  554314  554315  0
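If the goal is an actual BED file rather than a quick look, one possible sketch is to drop the depth column and redirect the result to a file. The helper name `zero_cov_bed` and the output file name `zero_coverage.bed` are placeholders, not part of bedtools. The demo feeds in a few rows copied from the output above, so it runs without a BAM file.

```shell
# Hypothetical helper: reads `bedtools genomecov -bga` output on stdin and
# prints only the zero-depth intervals as 3-column, tab-separated BED.
zero_cov_bed() {
  awk 'BEGIN { OFS = "\t" } $4 == 0 { print $1, $2, $3 }'
}

# Demo on rows copied from the example output (no BAM needed).
printf '%s\n' \
  'chr1 0 554304 0' \
  'chr1 554304 554309 5' \
  'chr1 554313 554314 1' \
  'chr1 554314 554315 0' | zero_cov_bed

# With a real BAM (assumed to be coordinate-sorted, as genomecov expects):
# bedtools genomecov -ibam aln.bam -bga | zero_cov_bed > zero_coverage.bed
```

Because -bga already reports each run of equal depth as one interval, the zero-depth rows should not need any further merging.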
Maybe have a look at bedtools and coverageBed. There are options to report depth at each position (if you want that resolution) and counts of overlaps for each.
-d  Report the depth at each position in each B feature.
    Positions reported are one based.  Each position
    and depth follow the complete B feature.
-counts Only report the count of overlaps, don't compute fraction, etc.
EDIT: forgot, but you would of course want to follow this by a filtering step (awk or grep) to get only the ones you want (e.g. the 0-count ones). Other alternatives would include running these through htseq-count or featureCounts followed by some similar filtering.
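As a rough sketch of that filtering step for the -counts case: the overlap count is appended as the last column of each feature row, so keeping rows whose last field is 0 should be enough. The helper name and the file names are made up for illustration, and the demo rows are invented, not real output.

```shell
# Hypothetical helper: keep features whose overlap count (the last column of
# `bedtools coverage -counts` output) is zero.
zero_count_features() {
  awk '$NF == 0'
}

# Demo on two invented feature rows; only geneB has a count of 0.
printf 'chr1\t100\t200\tgeneA\t12\nchr1\t300\t400\tgeneB\t0\n' | zero_count_features

# With real files (placeholder names). In recent bedtools releases coverage is
# reported for the -a features; older releases, like the help text quoted
# above, reported it for the B features, so check `bedtools coverage -h`.
# bedtools coverage -a features.bed -b aln.bam -counts | zero_count_features
```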
Yup, definitely better than my answer (I misinterpreted 0 coverage to mean overlaps with a list of features). :P
This is really cool! But if I want to see which CDS features have 0 coverage across the entire span of the CDS, how do I do that?
Could I use bedtools intersect? I just need my CDS in .bed format, and then this output is also BED?
This should work.
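To make the idea concrete, here is a minimal sketch of the containment check in plain awk: a CDS counts as fully uncovered if it lies entirely inside one zero-coverage interval. The helper name `uncovered_cds` and the file names `zero_coverage.bed` and `cds.bed` are assumptions for illustration, and the demo rows are invented. With bedtools itself, something like `bedtools intersect -a cds.bed -b zero_coverage.bed -f 1.0 -u` should give a comparable result, since -f 1.0 requires the whole A feature to be overlapped.

```shell
# Hypothetical helper: print rows of a CDS BED file that lie entirely inside
# one interval of a zero-coverage BED file (BED is 0-based, half-open).
# Usage: uncovered_cds zero_coverage.bed cds.bed
uncovered_cds() {
  awk '
    NR == FNR {                      # first file: zero-coverage intervals
      n[$1]++
      s[$1, n[$1]] = $2 + 0
      e[$1, n[$1]] = $3 + 0
      next
    }
    {                                # second file: one CDS per row
      for (i = 1; i <= n[$1]; i++)
        if (s[$1, i] <= $2 + 0 && $3 + 0 <= e[$1, i]) { print; break }
    }
  ' "$1" "$2"
}

# Demo with invented CDS rows: cdsA sits inside chr1:0-554304, cdsB does not.
tmp=$(mktemp -d)
printf 'chr1\t0\t554304\nchr1\t554314\t554315\n' > "$tmp/zero_coverage.bed"
printf 'chr1\t1000\t2000\tcdsA\nchr1\t554300\t554310\tcdsB\n' > "$tmp/cds.bed"
uncovered_cds "$tmp/zero_coverage.bed" "$tmp/cds.bed"
rm -r "$tmp"
```

This loops over every zero interval for every CDS, which is fine for a sketch but slow on large genomes; bedtools is the better choice there.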
This has worked very well, thank you!
I am now looking to calculate average coverage for my CDS features, one value per CDS, and I can't seem to find any way to do that.
Hi, did you get an answer for this? I have tried several bedtools subcommands but always get 0 coverage per CDS. The same files show good coverage graphs in e.g. Geneious.
Hi Aaron,
I just want to bump one of the replies to this comment. I would also like to summarize coverage by region. Is that doable with bedtools?
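One way this might be done, offered as a sketch rather than a definitive answer: take the `genomecov -bga` output from earlier in the thread and compute a length-weighted mean depth for each CDS. The helper name `mean_cov_per_cds` and the file names are hypothetical, and the two CDS rows in the demo are invented, while the depth rows are copied from the example output above. Recent bedtools releases also appear to offer this in one step via `bedtools coverage -a cds.bed -b aln.bam -mean`, which is worth checking in `bedtools coverage -h`.

```shell
# Hypothetical helper: append the mean depth of each CDS as a last column.
# Usage: mean_cov_per_cds coverage.bedgraph cds.bed
#   coverage.bedgraph: output of `bedtools genomecov -ibam aln.bam -bga`
mean_cov_per_cds() {
  awk '
    BEGIN { OFS = "\t" }
    NR == FNR {                      # first file: chrom, start, end, depth
      n[$1]++
      s[$1, n[$1]] = $2 + 0
      e[$1, n[$1]] = $3 + 0
      d[$1, n[$1]] = $4 + 0
      next
    }
    {                                # second file: one CDS per row
      sum = 0
      for (i = 1; i <= n[$1]; i++) {
        lo = (s[$1, i] > $2 + 0) ? s[$1, i] : $2 + 0
        hi = (e[$1, i] < $3 + 0) ? e[$1, i] : $3 + 0
        if (hi > lo) sum += (hi - lo) * d[$1, i]   # overlapping bases * depth
      }
      len = $3 - $2
      print $0, (len > 0 ? sum / len : 0)
    }
  ' "$1" "$2"
}

# Demo: depth rows from the genomecov example, two invented CDS intervals.
tmp=$(mktemp -d)
printf '%s\n' 'chr1 554304 554309 5' 'chr1 554309 554313 6' \
  'chr1 554313 554314 1' 'chr1 554314 554315 0' 'chr1 554315 554316 6' \
  'chr1 554316 554317 5' 'chr1 554317 554318 1' > "$tmp/coverage.bedgraph"
printf 'chr1\t554304\t554314\tcdsA\nchr1\t554314\t554318\tcdsB\n' > "$tmp/cds.bed"
mean_cov_per_cds "$tmp/coverage.bedgraph" "$tmp/cds.bed"
rm -r "$tmp"
```

Note that in this sketch the chromosome names have to match exactly between the two files (for example `chr1` versus `1`); if they do not, every CDS comes out with a mean of 0.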