Possible error in BEDtools coverage -hist?
0
0
Entering edit mode
7.8 years ago
orlando.wong ▴ 60

I am using bedtools coverage -hist to calculate coverage and depth for a region in chr4 (83739813 - 94128608), which is approximately 10 megabase pairs.

The third column (size of the file, 10kb.window.bed) is 24522538? Shouldn't it be computed to be 10 mb? (83.7 mb - 94 mb)?

Here are the commands I used. (I am using Cygwin with bedtools v. 2.25.0)

$ bedtools coverage -hist -a 10kb.window.bed -b der4_83.8mb-93.8mb.bed | grep ^all | head

all     0       23171185        24522538        0.9448934
all     2       295290  24522538        0.0120416
all     4       150622  24522538        0.0061422
all     6       88055   24522538        0.0035908
all     8       74499   24522538        0.0030380
all     10      59533   24522538        0.0024277
all     12      50732   24522538        0.0020688
all     14      43069   24522538        0.0017563
all     16      35698   24522538        0.0014557
all     18      31927   24522538        0.0013019

Here is file A (head and tail). It's chr4: 83739813 - 94128608. I used bedtools makewindows to make 10 kb windows for the coverage calculation.

$ cat 10kb.window.bed |head
chr4    83739813        83749813
chr4    83749813        83759813
chr4    83759813        83769813
chr4    83769813        83779813
chr4    83779813        83789813
chr4    83789813        83799813
chr4    83799813        83809813
chr4    83809813        83812419
chr4    83739813        83749813
chr4    83749813        83759813

$ cat 10kb.window.bed |tail
chr4    94034099        94044099
chr4    94044099        94054099
chr4    94054099        94064099
chr4    94064099        94074099
chr4    94074099        94084099
chr4    94084099        94094099
chr4    94094099        94104099
chr4    94104099        94114099
chr4    94114099        94124099
chr4    94124099        94128608

And this is for my file:

$ cat der4_83.8mb-93.8mb.bed |head
chr4    83754146        83754197        HISEQ:373:H5G3FBCXX:1:1211:4092:69849/1 0       -
chr4    83754146        83754197        HISEQ:373:H5G3FBCXX:1:1211:4092:69849/1 0       -
chr4    83769677        83769728        HWI-D00108:359:H5G2WBCXX:1:1103:6159:7143/1     25      -
chr4    83769677        83769728        HWI-D00108:359:H5G2WBCXX:1:1103:6159:7143/1     25      -
chr4    83769680        83769732        HISEQ:373:H5G3FBCXX:2:1214:1270:69568/1 37      +
chr4    83769680        83769732        HISEQ:373:H5G3FBCXX:2:1214:1270:69568/1 37      +
chr4    83769681        83769707        HISEQ:373:H5G3FBCXX:1:2115:5962:55329/2 29      +
chr4    83769681        83769707        HISEQ:373:H5G3FBCXX:1:2115:5962:55329/2 29      +
chr4    83769681        83769707        HISEQ:373:H5G3FBCXX:2:1205:1697:44909/2 29      +
chr4    83769681        83769707        HISEQ:373:H5G3FBCXX:2:1205:1697:44909/2 29      +

$ cat der4_83.8mb-93.8mb.bed |tail
chr4    94693017        94693068        HISEQ:373:H5G3FBCXX:1:1205:13294:78672/1        37      +
chr4    94693017        94693068        HISEQ:373:H5G3FBCXX:1:1205:13294:78672/1        37      +
chr4    94693017        94693068        HWI-D00108:359:H5G2WBCXX:1:1206:9419:39653/1    37      +
chr4    94693017        94693068        HWI-D00108:359:H5G2WBCXX:1:1206:9419:39653/1    37      +
chr4    94693017        94693068        HISEQ:373:H5G3FBCXX:1:1110:20503:10820/1        37      -
chr4    94693017        94693068        HISEQ:373:H5G3FBCXX:1:1110:20503:10820/1        37      -
chr4    94693018        94693069        HWI-D00108:359:H5G2WBCXX:1:1107:6331:46872/2    37      -
chr4    94693018        94693069        HWI-D00108:359:H5G2WBCXX:1:1107:6331:46872/2    37      -
chr4    94693040        94693061        HISEQ:373:H5G3FBCXX:1:1101:18283:23886/2        29      -
chr4    94693040        94693061        HISEQ:373:H5G3FBCXX:1:1101:18283:23886/2        29      -

Thanks for your help!

next-gen sequence software error alignment • 2.3k views
ADD COMMENT
1
Entering edit mode

That 10kb.window.bed file looks irregular. Do a search for '83812419' to see what i mean. Are these supposed to be reads? Are you storing read data in a bed file?

ADD REPLY
0
Entering edit mode

Thanks for answering!

I did a search for 83812419. I am not sure why there are repeats of it.

   $ grep 83812419 10kb.window.bed
    chr4    83809813        83812419
    chr4    83809813        83812419
    chr4    83809813        83812419
    chr4    83809813        83812419
    chr4    83809813        83812419
    chr4    83809813        83812419
    chr4    83809813        83812419
    chr4    83809813        83812419
    chr4    83809813        83812419
    chr4    83804837        83812419
    chr4    83804363        83812419

This 10kb.window.bed bedfile was made just to compute the coverage of aligned sequences on 10 kb windows that span chr4 from 83,000,000 bp to 93,000,000 bp using the bedtools coverage.

I made it by downloading the .bed file of chr4: 83 mb - 93 mb from UCSC genome browser query , and did this command to make 10 kb windows:

$ bedtools makewindows -b theUCSCchr4file.bed -w 10000000

The other file is a BED file converted from a BAM file. The BAM file was produced from mapping with BWA paired-end reads against hg19 (using Galaxy's BWA).

ADD REPLY
0
Entering edit mode

Tagging: Aaronquinlan

ADD REPLY

Login before adding your answer.

Traffic: 1783 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6