How To Force 'Bwa Samse' To Output Multiple Hits In .Sam Format?
1
2
Entering edit mode
11.9 years ago

I am using 'bwa' to map Illumina reads to a large database and would like to be able to store (and analyse) the information on multiple hits. I have forced this very brutishly using the following:

bwa samse -n 1000 database.fasta aln_sa.sai reads.fasta > aln_1000.sam

However, the output is no longer in classical .sam format (i.e. no flags, no CIGAR string, ...). Would someone know how to force a real .sam output?

Just as an example, here is the kind of output I get:

>1-815511 3 3
bta-mir-21    +8    0
mRNA___ENSBTAG00000029897|ENSBTAT00000042276|bta-mir-21|bta-mir-21    +18    0
19    +11033079    0
>2-592417 3 3
bta-mir-21    +8    0
mRNA___ENSBTAG00000029897|ENSBTAT00000042276|bta-mir-21|bta-mir-21    +18    0
19    +11033079    0
...

What I have tried so far with no luck:

  • use the xa2multi.pl script which parses the information in the XA:Z: (information on other hits when their number is inferior to the value of the -n option) to output one line per hit -- this scripts is now part of the 'bwa' package.

Unfortunately, the output of a normal:

bwa samse database.fasta aln_sa.sai reads.fasta > aln.sam

does not contain any XA:Z: field... Even for read 2-592417 which has several hits:

1-815511    0    mRNA___ENSBTAG00000029897|ENSBTAT00000042276|bta-mir-21|bta-mir-21    18    0    23M    *    0    0    TAGCTTATCAGACTGATGTTGAC    *    XT:A:R    NM:i:0    X0:i:3    X1:i:0    XM:i:0    XO:i:0    XG:i:0    MD:Z:23
2-592417    0    19    11033079    0    22M    *    0    0    TAGCTTATCAGACTGATGTTGA    *    XT:A:R    NM:i:0    X0:i:3    X1:i:0    XM:i:0    XO:i:0    XG:i:0MD:Z:22
  • install the previous bwa-0.5.7 version for which a patch doing exactly what I need has been developed at Broad Institute. But the patch has vanished from its ftp site and is now unavailable.

Any ideas?

bwa sam • 5.9k views
ADD COMMENT
2
Entering edit mode

I just tried this on a test data generated from the yeast genome and bwa version 0.5.9 seems to fill out all the SAM fields as expected - what version are you running?

ADD REPLY
0
Entering edit mode

I was actually running 0.5.5 (!) but have now installed 0.6.1 and will run it to see if the problem is solved. Thanks a lot!

ADD REPLY
0
Entering edit mode

Perfect! In the latest bwa version, the output is in a fully populated SAM file (XA:Z: field is now present). I just then run the xa2multi.pl on it to obtain one hit per line.

ADD REPLY
0
Entering edit mode

Could you please add this as an answer. Thanks.

ADD REPLY
4
Entering edit mode
11.9 years ago

Thanks to Istvan's suggestion, in the latest bwa version, the output is in a fully populated SAM file (XA:Z: field is now present). I just then run the xa2multi.pl on it to obtain one hit per line.

ADD COMMENT

Login before adding your answer.

Traffic: 2981 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6