bam-readcount not finding reference in bam file
1
0
Entering edit mode
5.8 years ago
obuzko • 0

Hi, I'm trying to run bam-readcount with region information (as a workaround to the 8000 read limitation). I use version 0.8 cloned from the repository.

The reference FASTA file (reference_1.fa) has this header:

>reference_1:1-35911

It's indexed: bwa index reference_1.fa

I run the alignment:

bwa mem -t 8 reference_1.fa -p input.fastq | samtools sort -@8 -O BAM -o alignment_reference_1.bam -

Output is indexed:

samtools index alignment_reference_1.bam

At this point, I need to get base counts at each position in the reference. I created a separate file with region information (reference_1_regions.txt) that contains one line:

reference_1       1       35911 (tab-delimited)

The actual command:

bam-readcount -w 1 -d 1000000 -l reference_1_regions.txt -f reference_1.fa alignment_reference_1.bam > readcount_reference_1.txt

The output:

reference_1 not found in bam file. Region reference_1 1 35911 skipped.

I checked the bam file header with samtools view -H and here's the output:

@HD VN:1.5  SO:coordinate
@SQ SN:reference_1:1-35911  LN:35911
@PG ID:bwa  PN:bwa  VN:0.7.17-r1188 CL:bwa mem -t 8 reference_1.fa -p input.fastq

If anyone has any suggestions about the reason for this error, please let me know. Is this a bug in bam-readcount or am I missing something?

Thank you in advance

Sasha

bam-readcount bam • 1.7k views
ADD COMMENT
0
Entering edit mode

Thank you! The funny thing is that it worked as some kind of a hack. I did try just using >reference_1 in the FASTA header and then "reference_1 1 35911" in the regions file, and it didn't work. Bam-readcount kept saying that reference_1 not found in FASTA file with a core dump.

However, when FASTA file contains ">reference_1:1-35911" and the regions file has "reference_1:1-35911 1 35911", it works. Go figure... In any event, thanks for the hint.

ADD REPLY
0
Entering edit mode

Please use ADD COMMENT or ADD REPLY to answer to previous reactions, as such this thread remains logically structured and easy to follow. I have now moved your reaction but as you can see it's not optimal. Adding an answer should only be used for providing a solution to the question asked.

If an answer was helpful you should upvote it, if the answer resolved your question you should mark it as accepted.
Upvote|Bookmark|Accept

ADD REPLY
2
Entering edit mode
5.8 years ago
ATpoint 82k

@SQ SN:reference_1:1-35911 LN:35911 Blockquote

As you can see from this line, the chromosome name is reference_1:1-35911, not reference_1. Change that in your reference_1_regions.txt. Are you trying to get an output for an entire BAM file? What do you want to do with it. The only use of the output I know of is together with the fpfilter of VarScan2, but I might be wrong.

ADD COMMENT

Login before adding your answer.

Traffic: 1830 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6