bam-readcount reports all reference bases as N
2
1
Entering edit mode
10.0 years ago
Chris F. ▴ 20

Hi,

I'm running bam-readcount (commit 6c3f3ae901) on a few hundred bam files against a single reference fasta file (designated with -f). However, when I look at the output for any of the files, all of the reference bases, for any position, are N.

NODE_14_length_46_cov_1.239130  52      N       1       =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  A:1:2.00:36.00:2.00:1:0:0.00:0.02:0.00:0:0.00:0.00:0.00 C:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00    G:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  T:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00

Is this a bug? The base in the reference sequence at this position is A.

>NODE_14_length_46_cov_1.239130
AGCTAACTGAGTTTATCACACTCAGTTAATGTCCATTTCACTTCACACATAACCTTACAG
ATCGGAAGATCTCGTA

Thanks!

bam-readcount • 3.4k views
ADD COMMENT
3
Entering edit mode

A total guess, but is there any chance the reference fasta file is wrong (i.e. not the one that was used to make the bams)? I seem to remember having a similar problem when I mixed up reference files doing similar things.

ADD REPLY
0
Entering edit mode

Thanks, but in this case, there's only a single reference file.

ADD REPLY
3
Entering edit mode
10.0 years ago
ernfrid ▴ 220

I see this if I run on the entire BAM file without specifying a list of positions or a region. This is a bug. I pushed a very quick fix out that should solve this issue. It's 4b6479a42d002d855eda6a45bca097756d493cdb. Does this fix the issue?

ADD COMMENT
0
Entering edit mode

Yup - reference bases are being reported now. Thanks!

ADD REPLY
1
Entering edit mode
10.0 years ago
tgi.tabbott ▴ 230

The only way I was able to reproduce this type of problem was by manually changing the reference fasta in a way that invalidates the index.

Assuming you have samtools handy, what is the result of:

samtools faidx ref.fa NODE_14_length_46_cov_1.239130:50-52

where ref.fa is your reference fasta? If you get something unexpected, try removing the "ref.fa.fai" file (or whatever yours is called) and running bam-readcount again (it will be automatically recreated).

One way this could happen is if you generated the fasta, ran bam-readcount, then regenerated the fasta with different contents. The fasta index would be stale in this case.

ADD COMMENT
0
Entering edit mode
~/data7/src/samtools/samtools faidx contigs.fa NODE_14_length_46_cov_1.239130:50-52
>NODE_14_length_46_cov_1.239130:50-52
TAA
ADD REPLY
0
Entering edit mode

I blew away the index and recreated it with bam-readcount (it's the same size). Still getting N for the reference base.

ADD REPLY

Login before adding your answer.

Traffic: 2442 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6