Samtools: append output
2
0
Entering edit mode
9.4 years ago
bongbang ▴ 80

I'm trying to combine outputs from various view queries each with its own criteria into one bam file. From my bash script:

samtools view -b -F 12 -f 50 $bam_file > $bam_inconsistent # Wrong orientation -/-
samtools view -b -F 60 -f 2 $bam_file >> $bam_inconsistent # Wrong orientation +/+
samtools view -b -F 14 $bam_file >> $bam_inconsistent # Wrong insert size

Unfortunately only the first command seems to take effect.How can I fix this? Thank you.

(BTW, if there's an easier way to accomplish what I want, I'm open to any suggestion, although I would still like to understand why the above doesn't work.) Thank you.

Edit: I've found out why this doesn't work. A binary cannot be appended. Looks like I'll have to use sam output and convert back to bam.

samtools • 3.6k views
ADD COMMENT
2
Entering edit mode
9.4 years ago

aside from the proper result you want to obtain, the rationale of the problem is wrong as you are trying to concatenate binary bam files as if they were plain text files. what you should do is to perform the 3 filters independently, each one generating its own bam file, and then perform a final step that would merge those 3 files into a single one:

samtools view -b -F 12 -f 50 $bam_file > $bam_inconsistent_1 # Wrong orientation -/-
samtools view -b -F 60 -f 2 $bam_file > $bam_inconsistent_2 # Wrong orientation +/+
samtools view -b -F 14 $bam_file > $bam_inconsistent_3 # Wrong insert size
samtools merge $bam_inconsistent_merged $bam_inconsistent_1 $bam_inconsistent_2 $bam_inconsistent_3

Edit: I just saw Pierre's comment above describing a oneliner that would do the work which looks much more elegant and efficient than this answer. I would definitely go for it. I'll leave this answer as I think that it highlights the underlying problem, although the best solution would be Pierre's (samtools view -f2 -h 1.bam ; samtools view -F4 1.bam; samtools -f8 1.bam) | samtools view -Sb -o merged.bam -

ADD COMMENT
1
Entering edit mode
9.4 years ago

option -F and -f use a logical OR, not a logical AND.

So -F 60 is exclude: read unmapped OR mate unmapped OR read reverse strand OR mate reverse strand. With those last two flags, you're going to remove most of your reads even if they're properly mapped.

if you need a fine filtering for your reads you can use my tool: SamJS

ADD COMMENT
0
Entering edit mode

I understand that. I'm actually trying to exclude the properly mapped reads. The last two commands are meant to add things that are not covered by the first, and they do according to the results I got from view -c (12, 35, and 64219 respectively for the three queries). The problem is that they are not adding anything when I try to output to file with the >> syntax. So I got only 12 reads in my file instead of 12+35+64219 expected. The problem here, I believe, is my unix syntax, not samtools.

ADD REPLY
1
Entering edit mode

Oh, I see. you need to use a simple redirection '>' and merge the new files using samtools merge

samtools merge merged.bam in.1.bam in.2.bam ...
ADD REPLY
0
Entering edit mode

Do you think I can use piping to obviate the need for temporary files?

ADD REPLY
1
Entering edit mode

yes but without using a binary format and the only first BAM needs to print the header (option -h). Something like:

(samtools view -f2 -h 1.bam ; samtools view -F4 1.bam; samtools -f8 1.bam) | samtools view -Sb -o merged.bam -
ADD REPLY
0
Entering edit mode

Many thanks. If you edit your answer to include this, I'll accept it. It'll make it easier for future readers with the same question.

ADD REPLY
1
Entering edit mode

FYI: each BAM file is ended with a special EOF block, so tools won't continue processing file once it's encountered, that's why tools report only 12 reads.

ADD REPLY
0
Entering edit mode

Actually correctly-written software should continue over embedded EOF blocks, exactly so that more BAM records could be appended to a BAM file (see ยง4.1.2 of the SAM spec). This didn't work for the O.P. because adding more BAM headers in the middle of the file won't work, and because some software has bugs in this admittedly esoteric area (including samtools at present).

ADD REPLY

Login before adding your answer.

Traffic: 2389 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6