Question: bam-readcount reports all reference bases as N
Chris F. (United States) wrote, 6.8 years ago:

Hi,

I'm running bam-readcount (commit 6c3f3ae901) on a few hundred BAM files against a single reference FASTA file (specified with -f). However, in the output for every file, the reference base at every position is reported as N.

NODE_14_length_46_cov_1.239130  52      N       1       =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  A:1:2.00:36.00:2.00:1:0:0.00:0.02:0.00:0:0.00:0.00:0.00 C:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00    G:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  T:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00
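To make the line above easier to read, here is a minimal sketch that splits it into its leading columns (contig, position, reference base, depth) and the per-base read counts. It assumes whitespace-separated columns and colon-separated per-base entries whose first two values are the base and its count, as in the line shown; `parse_readcount_line` is a hypothetical helper, not part of bam-readcount.

```python
def parse_readcount_line(line):
    """Split one bam-readcount output line into leading columns and per-base counts."""
    fields = line.split()
    record = {
        "chrom": fields[0],
        "pos": int(fields[1]),
        "ref_base": fields[2],
        "depth": int(fields[3]),
        "counts": {},
    }
    # Each remaining column looks like BASE:COUNT:... ; only the first two
    # colon-separated values are used here.
    for entry in fields[4:]:
        parts = entry.split(":")
        record["counts"][parts[0]] = int(parts[1])
    return record


# The output line from the question, copied as-is.
line = (
    "NODE_14_length_46_cov_1.239130 52 N 1 "
    "=:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 "
    "A:1:2.00:36.00:2.00:1:0:0.00:0.02:0.00:0:0.00:0.00:0.00 "
    "C:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 "
    "G:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 "
    "T:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 "
    "N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00"
)

rec = parse_readcount_line(line)
print(rec["chrom"], rec["pos"], rec["ref_base"], rec["depth"], rec["counts"])
```

Parsed this way, the single read at position 52 supports A, while the reference column holds N.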

Is this a bug?  The base in the reference sequence at this position is A.

>NODE_14_length_46_cov_1.239130
AGCTAACTGAGTTTATCACACTCAGTTAATGTCCATTTCACTTCACACATAACCTTACAG
ATCGGAAGATCTCGTA

 

Thanks!
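The comparison described in the question can be reproduced without bam-readcount. The following is a small sketch, assuming 1-based positions and a plain multi-line FASTA record; `read_fasta` and `base_at` are hypothetical helpers written for this illustration.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of {contig name: sequence}."""
    parts = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the contig name.
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {contig: "".join(chunks) for contig, chunks in parts.items()}


def base_at(seqs, contig, pos):
    """Return the base at a 1-based position on the given contig."""
    return seqs[contig][pos - 1].upper()


# The reference record quoted in the question.
FASTA = """>NODE_14_length_46_cov_1.239130
AGCTAACTGAGTTTATCACACTCAGTTAATGTCCATTTCACTTCACACATAACCTTACAG
ATCGGAAGATCTCGTA
"""

seqs = read_fasta(FASTA)
expected = base_at(seqs, "NODE_14_length_46_cov_1.239130", 52)
reported = "N"  # third column of the bam-readcount line in the question
print("expected:", expected, "reported:", reported, "match:", expected == reported)
```

Running the same check over every output line would show whether the N is specific to some positions or, as reported here, appears everywhere.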


A total guess, but is there any chance the reference fasta file is wrong (i.e. not the one that was used to make the bams)? I seem to remember having a similar problem when I mixed up reference files doing similar things.

(reply by matted, 6.8 years ago)

Thanks, but in this case, there's only a single reference file.

(reply by Chris F., 6.8 years ago)
ernfrid (United States) wrote, 6.8 years ago:

I see this if I run on the entire BAM file without specifying a list of positions or a region. This is a bug. I pushed a very quick fix out that should solve this issue. It's 4b6479a42d002d855eda6a45bca097756d493cdb. Does this fix the issue?
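For anyone who cannot move to the fixed commit, one possible workaround suggested by the description above is to always pass an explicit list of regions. This is an untested sketch: it assumes the site list given to bam-readcount with `-l` is a tab-separated file of contig, start, and end with 1-based coordinates (check `bam-readcount --help` for your build). `site_list_lines` is a hypothetical helper, and 76 is the length of the contig quoted in the question.

```python
def site_list_lines(contig_lengths):
    """Yield one 'contig<TAB>1<TAB>length' line per non-empty contig."""
    for name, length in contig_lengths.items():
        if length > 0:
            yield f"{name}\t1\t{length}"


# Contig lengths would normally come from the reference or its .fai index;
# this single entry is the contig from the question.
contigs = {"NODE_14_length_46_cov_1.239130": 76}

lines = list(site_list_lines(contigs))
print("\n".join(lines))
```

The printed lines can be written to a file and supplied as the site list, so that each contig is covered in full by an explicit region.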


Yup - reference bases are being reported now. Thanks!

(reply by Chris F., 6.8 years ago)
tgi.tabbott (United States) wrote, 6.8 years ago:

The only way I was able to reproduce this type of problem was by manually changing the reference fasta in a way that invalidates the index.

Assuming you have samtools handy, what is the result of:

samtools faidx ref.fa NODE_14_length_46_cov_1.239130:50-52

where ref.fa is your reference fasta? If you get something unexpected, try removing the "ref.fa.fai" file (or whatever yours is called) and running bam-readcount again (it will be automatically recreated).

One way this could happen is if you generated the fasta, ran bam-readcount, then regenerated the fasta with different contents. The fasta index would be stale in this case.
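A stale index of the kind described above can often be spotted before rerunning anything. The sketch below assumes the usual `.fai` layout, where the first two tab-separated columns are the contig name and its length; it flags the index if it is older than the FASTA or if the names and lengths disagree. `index_looks_stale` and the other function names are hypothetical, and the check is only a heuristic: it will not notice an edit that keeps every contig length the same and leaves the index timestamp newer than the FASTA.

```python
import os
import tempfile


def fasta_lengths(path):
    """Return {contig name: sequence length} computed from a FASTA file."""
    lengths = {}
    name = None
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:].split()[0]
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def fai_lengths(path):
    """Return {contig name: length} as recorded in a .fai index."""
    lengths = {}
    with open(path) as fh:
        for line in fh:
            cols = line.rstrip("\n").split("\t")
            if len(cols) >= 2:
                lengths[cols[0]] = int(cols[1])
    return lengths


def index_looks_stale(fasta_path, fai_path):
    """Heuristic: index older than the FASTA, or contig names/lengths differ."""
    if os.path.getmtime(fai_path) < os.path.getmtime(fasta_path):
        return True
    return fasta_lengths(fasta_path) != fai_lengths(fai_path)


# Demonstration on throwaway files: index a FASTA, then regenerate the FASTA.
with tempfile.TemporaryDirectory() as tmp:
    fa = os.path.join(tmp, "ref.fa")
    fai = fa + ".fai"
    with open(fa, "w") as fh:
        fh.write(">ctg1\nACGTACGT\n")
    with open(fai, "w") as fh:
        fh.write("ctg1\t8\t6\t8\t9\n")
    fresh = index_looks_stale(fa, fai)
    with open(fa, "w") as fh:
        fh.write(">ctg1\nACGT\n")  # FASTA regenerated with different contents
    stale = index_looks_stale(fa, fai)

print("before regeneration:", fresh, "after regeneration:", stale)
```

If the check flags the index, deleting the `.fai` file and letting it be rebuilt, as suggested above, is the simplest remedy.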

~/data7/src/samtools/samtools faidx contigs.fa NODE_14_length_46_cov_1.239130:50-52
>NODE_14_length_46_cov_1.239130:50-52
TAA
(reply by Chris F., 6.8 years ago)

I blew away the index and recreated it with bam-readcount (it's the same size). Still getting N for the reference base.

(reply by Chris F., 6.8 years ago)