discordant results between bedTools and awk
0
0
Entering edit mode
8.9 years ago
bioguy24 ▴ 230

The output from the bedTools 2.17 command below:

coverageBed -abam IonXpress_009_150603.bam -b epilepsy70_medex.bed -counts > count.txt

and this awk

awk '{
  if(len==0) { last=$4;total=$6;len=1;getline }
  if($4!=last) {
    printf("%s\t%f\n", last, total/len);
    last=$4;
    total=$6;
    len=1
  }
  else {
    total+=$6;
    len+=1
  }
}
END {
  printf("%s\t%f\n", last, total/len)
}' output.bam.hist.txt > epilepsy70_average.txt

are different though, I believe, they are both calculating the average read depth per target in the bed file.

Count.txt

chr1    40539722    40539865    chr1:40539722-40539865    281
chr1    40542489    40542609    chr1:40542489-40542609    126
chr1    40544221    40544341    chr1:40544221-40544341    77

epilepsy70_average.txt

chr1:40539722-40539865    227.776224
chr1:40542489-40542609    104.300000
chr1:40544221-40544341    61.54166

I do not see a way to attach but the awk basically uses the output of:

coverageBed -hist -d -abam IonXpress_001.bam -b LCH_IDT > output.bam.hist.txt

​and combines and averages all baits that match. Why are the outputs different?

Thank you :)

next-gen awk bedtools • 1.8k views
ADD COMMENT
0
Entering edit mode

The bedtools coverage command does not calculate the average read depth, it calculates how many reads align to / overlap a region.

ADD REPLY

Login before adding your answer.

Traffic: 1500 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6