bam-readcount empty output
0
0
Entering edit mode
6.8 years ago

I have encountered similar questions elsewhere, but none exactly replicate my problem.

I ran bam-readcount for about 1000 sites (across all chromosomes) for two samples from the same individual. I know both bam files have reads at the sites in question, as these sites were called as variant by multiple somatic callers (mutect, varscan, etc). Yet, for one sample I get an empty output from bam-readcount, while for the other I get the expected counts. To illustrate the problem with 1 site, see below.

First, I make the input site list:

echo "1 884091 884091" | tr ' ' '\t' > test.bed

Second, I run bam-readcount:

bam-readcount --reference-fasta ref.fa --site-list test.bed sample1.bam > test1

bam-readcount --reference-fasta ref.fa --site-list test.bed sample2.bam > test2

When I do this, we see test1 is an empty file, whereas test2 has counts for the site.

Third, as a sanity chec, I use samtools to subset these bams using the same site list. This is to verify this site is indeed covered in both bam files:

samtools view -L test.bed sample1.bam > test3

samtools view -L test.bed sample2.bam > test4

When we do this, we see there is output for both files:

head -1 test3 test4

==> test3 <== HWI-D00310:198:CALR0ANXX:5:1116:18839:21292 99 1 883969 60 125M = 883971 127 GACCTGCTGGAACATCTGCCCCAAGGGCCGTGTCAGGCTCTCTCGGCCCCATGCCTGGTCACCCTGGCTTCACCCTGGCTGCACCCTGGTCCCCCTGGTCCCTTTGGCCCTGCACCTGGCTGCAC >??ADEEEGECCDDCEEGEDDDDCFEEED@BGBEDFEEEEEEEE@EEDDDDCGECEGEBEDEDDEGEEECEDEDDEFEEEGEDECDEGEBEDDDDEGCBDDDECCGEEDDEGEDEDEGEDCB?>@ MC:Z:125M BD:Z:KKMLKNNNMMLMMLMNKLLKIHJJKLHJJIJJILKMLJKJJJJJKJJJHHJLLKJHKKJLKJKHHKKJKIKKJKHHKKJKKKLJKHHKKJLKHHHILLKMLIIJEKLKKIILLMKLJMMLNLLMK MD:Z:125 RG:Z:BCALR0ANXX.5.TCCTGAGC-ACTGCATA BI:Z:PPSRUWUVVSTRROSSRTSQMMSPQPMRPPROOSRTPSTQRQRQTQSQMMSSTSQRTRSTSPRMRTRSTQRSPRMRTRSTTSSPRMRTRSTQNNNSUSTURNSRKTSTROTVUURTTVTUVTSSP NM:i:0 BQ:Z:@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ MQ:i:60 AS:i:125 XS:i:19

==> test4 <== HWI-D00310:198:CALR0ANXX:4:2308:5529:20058 99 1 883970 60 125M = 883992 147 ACCTGCTGGAACATCTGCCCCAAGGGCCGTGTCAGGCTCTCTCGGCCCCATGCATGGTCACCCTGGCTTCACCCTGGCTGCACCCTGGTCCCCCTGGTCCCTTTGGCCCTGCACCTGGCTGCACC AA@12DE1@/ADCDFFFA-.AC=E/DED>AFAECGACFA0CDD?CD7?CDDGBABFEBFCBCEFGCFEB/CB.?DGEFFCF.DAEDGE-B/AEEDD<9EABFCBGDE//=0?0DDD0DCDD?>@@ MC:Z:125M BD:Z:KKLJPPNOMLMMJMNKLLKHHJJJLGIJHJIHKKMLJKJJJJJKKJKHHJLLLLLLKJLKJKHILKJKIKKJKHILKJKLLLJKHILKJLKHHHILKJLKIJJEKLKLIJMMMKLJNMLMOMMKL MD:Z:53C71 RG:Z:BCALR0ANXX.4.TCCTGAGC-TATCCTCT BI:Z:PPQRUUVTSSQQORSPRRPLLQNQPMRPPROORQRPRSOPOPORPRPLLQQRRRQRQRRQNPLQRQSTPRROQLRSRSTSSSOQMRSRSSQMMMRSRSSQMRPJSSTRNSTTTPSTUTUVUSSOQ NM:i:1 BQ:Z:@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ MQ:i:60 AS:i:120 XS:i:20

If anyone has seen similar behavior and has any suggestions, would be greatly appreciated.

bam-readcount • 2.9k views
ADD COMMENT
0
Entering edit mode

https://github.com/genome/bam-readcount

The list of regions should be formatted as chromosome start and end. Each field should be tab separated and coordinates should be 1-based.

Could that be a problem that your list of regions is in bed-format?

ADD REPLY
0
Entering edit mode

The file is formatted as you describe. Furthermore, I don't believe it is a problem with the region list file because this file is working for sample2.bam

ADD REPLY
0
Entering edit mode
1 884091 884091

That's in bed-format!

Furthermore, I don't believe it is a problem with the region list file because this file is working for sample2.bam

you never know the internal working mechanisms of such programs, so it is better to follow exactly what is being said in the manual

ADD REPLY
0
Entering edit mode

Again, the test.bed file in my example is formatted as the manual instructs: three tab-separated columns (chr, start, end). The start and end having the same position does not matter for bam-readcount. To be sure, I have run this using different iterations of the region list file and the same issue occurs. Its very strange

ADD REPLY
0
Entering edit mode

Please use ADD COMMENT or ADD REPLY to answer to previous reactions, as such this thread remains logically structured and easy to follow. I have now moved your post but as you can see it's not optimal. Adding an answer should only be used for providing a solution to the question asked.

ADD REPLY

Login before adding your answer.

Traffic: 2896 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6