Question: Use bedtools intersect to see how much coverage you had of a specific coordinate
0
gravatar for aropri
2 days ago by
aropri10
aropri10 wrote:

How would I compare a bed file and a bam file using bedtools intersect to see how much coverage I had of specific coordinates in my bed file? Thank you in advance for your help!

chip-seq • 204 views
ADD COMMENTlink modified 23 hours ago by Alex Reynolds31k • written 2 days ago by aropri10
1
gravatar for jared.andrews07
2 days ago by
jared.andrews077.9k
Memphis, TN
jared.andrews077.9k wrote:

You want samtools depth.

ADD COMMENTlink written 2 days ago by jared.andrews077.9k
1

mosdepth (LINK) if you want a more performant option.

ADD REPLYlink written 2 days ago by genomax92k

Thanks, this is helpful, going to try out the couple of ways you guys suggested

ADD REPLYlink written 2 days ago by aropri10
1
gravatar for Alex Reynolds
2 days ago by
Alex Reynolds31k
Seattle, WA USA
Alex Reynolds31k wrote:

Here's one way:

$ bedmap --echo --count --echo-overlap-size --echo-ref-size --delim '\t' my_intervals.bed <(bam2bed < my_reads.bam) | awk -vFS="\t" -vOFS="\t" '{ print $0, ($NF-1)/$NF; }' > answer.bed

Note: Please see my second answer in this thread, which uses your specific inputs.

ADD COMMENTlink modified 23 hours ago • written 2 days ago by Alex Reynolds31k

Thank you, will try this out and see how it goes.

ADD REPLYlink written 2 days ago by aropri10

Running bedmap --help will describe the --count, --echo-overlap-size and --echo-ref-size options. This should give you numbers much like what other tools report.

ADD REPLYlink written 2 days ago by Alex Reynolds31k
chr12     75943973          75954960          0                 10987       0.999909

What do the last 3 columns represent

ADD REPLYlink modified 2 days ago by genomax92k • written 2 days ago by aropri10

I'm not sure why you would see three columns. There should be four columns: the number of reads which overlap the interval (--count), the total lengths of overlaps between reads and the interval (--echo-overlap-size), the length of the interval (--echo-ref-size), and the fraction of overlap length by interval (via awk). If you can post your inputs somewhere, I can take a look.

ADD REPLYlink written 2 days ago by Alex Reynolds31k

I would like to post my input so you can check, where do you suggest I upload it so you can see. I would have replied earlier but because I have a new account can only post so much in 5 hrs so couldnt reply.

ADD REPLYlink written 2 days ago by aropri10

I have copied the link where my bed and bam file is that I was using to measure coverage of my bed coordinates with the command you gave. Hope you can look at this and if the output is correct.

https://drive.google.com/drive/folders/1tXtRjoEvs3k2N-DZWayT5pR9iHy0slxM?usp=sharing

ADD REPLYlink written 2 days ago by aropri10

I'll try to take a look this weekend

ADD REPLYlink written 1 day ago by Alex Reynolds31k

Thank you, feel bad for bothering you. Even if you are not able to, no worries, appreciate your help.

ADD REPLYlink written 1 day ago by aropri10

I'm going to add a separate answer, to keep things clean.

ADD REPLYlink written 23 hours ago by Alex Reynolds31k

chr12 75943973 75954960 0 10987 0.999909

This is the output I got when for one of my regions when I ran this, what are the last 3 columns representing, a little confused there. Appreciate your help.

ADD REPLYlink modified 2 days ago • written 2 days ago by aropri10
1
gravatar for Alex Reynolds
23 hours ago by
Alex Reynolds31k
Seattle, WA USA
Alex Reynolds31k wrote:

First, please sort the BED file with sort-bed (not sortBed or UNIX sort):

$ sort-bed 10A_AT1_SE1.bed > 10A_AT1_SE1.sorted.bed

Then map the sorted BED file to the BAM file:

$ bedmap --skip-unmapped --delim '\t' --echo --count --bases-uniq --echo-ref-size --bases-uniq-f 10A_AT1_SE1.sorted.bed <(bam2bed < 10A_1_H3K.sorted.bam) > 10A_1_H3K.coverage.bed

The file 10A_1_H3K.coverage.bed will look something like this:

$ head 10A_1_H3K.coverage.bed
chr1    199064  200374  404 1310    1310    1.000000
chr1    6853080 6870040 204 9044    16960   0.533255
chr1    8004008 8027135 507 22308   23127   0.964587
chr1    8061658 8098833 366 37175   37175   1.000000
chr1    8174215 8211962 369 34253   37747   0.907436
chr1    8873384 8890703 434 17319   17319   1.000000
chr1    8874708 8890922 420 16214   16214   1.000000
chr1    15833039    15850691    271 17652   17652   1.000000
chr1    16643023    16645543    266 2520    2520    1.000000
chr1    16665212    16667822    281 2610    2610    1.000000

The last four columns are:

  • The number of reads in the BAM file that overlap an interval in the BED file by one or more bases
  • The number of bases in the interval in the BED file that have coverage by reads from the BAM file
  • The length of the interval in the BED file
  • The fraction of bases in the interval in the BED file that have coverage by reads from the BAM file

The order of the specified bedmap operands --count, --bases-uniq, --echo-ref-size, and --bases-uniq-f write the aforementioned column values in that same order.

If one base of read coverage is too lenient, additional operands are available to further constrain that criterion. Run bedmap --help and take a look at the "Overlap Options" section.

Hope this helps!

ADD COMMENTlink written 23 hours ago by Alex Reynolds31k

I really appreciate this, thank you for taking the time out of your day for the detailed answer. I am so thankful to you and this truly helps further my analysis and allows me to understand what is being done. All the thanks!

ADD REPLYlink written 19 hours ago by aropri10

Just one last question, I should be able to use the counts column that I have to normalize my reads by DESeq2. I ask this because i have three coverage.bed files and I want to normalize the read coverage between them. I have used DESeq2 before so was wondering would that method work here as well? Thanks for your help

ADD REPLYlink written 18 hours ago by aropri10

Yes, these are raw read counts.

ADD REPLYlink written 17 hours ago by Alex Reynolds31k

Thank you! really appreciate all your help

ADD REPLYlink written 14 hours ago by aropri10

Please upvote and accept this (and/or any other answers) that provided a solution for your problem. Use "green check" mark to do so. This will provide closure to the thread.

ADD REPLYlink modified 13 hours ago • written 13 hours ago by genomax92k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2243 users visited in the last hour