BWA - How to Use the XM Tag
11.8 years ago
Arpssss ▴ 40

I have installed BWA and built an index of hg19 with the command:

bwa index -a bwtsw hg19.fa

Now I run the alignment with the command:

./bwa aln hg19.fa SRR44930951.fastq > alnsa.sai

However, I want to find alignments that allow 1 or 2 mismatches.

From the BWA home page, I gathered that I have to use the XM tag.

But I can't work out how to use it, that is, what the command should be.

Can anybody please help me with this?

Thanks.

bwa genome • 8.2k views

Do you want to align reads with a maximum of 1 or 2 mismatches (while running BWA), or do you want to find reads (e.g. in the SAM file) that have at most 1 or 2 mismatches after mapping?


I want to align reads with a maximum of 1 or 2 mismatches (while running BWA).


Hmm, but I can't find the answer without a pointer :).


You need to combine the XM and NM tags. It also depends on your definition of a mismatch. If substitutions and indels are all included, then BWA won't report that number directly. Reconstructing the alignment from the CIGAR string, the read, and the reference would be the most accurate way, but it is time-consuming.
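
One way to read the suggestion above: XM counts the mismatches in the alignment, while inserted and deleted bases can be summed from the CIGAR string (column 6 of a SAM record). A minimal sketch, assuming SAM text on stdin that carries an XM tag as written by bwa samse; the function name `cigar_edits` is hypothetical:

```shell
# Hypothetical helper: for each SAM record on stdin, print
#   read name <TAB> XM value (or NA) <TAB> inserted bases <TAB> deleted bases
cigar_edits() {
  awk -F '\t' -v OFS='\t' '
    /^@/ { next }                              # skip header lines
    {
      cigar = $6; ins = 0; del = 0
      # consume the CIGAR string one <length><operation> pair at a time
      while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
        len = substr(cigar, 1, RLENGTH - 1) + 0
        op  = substr(cigar, RLENGTH, 1)
        if (op == "I") ins += len
        if (op == "D") del += len
        cigar = substr(cigar, RLENGTH + 1)
      }
      xm = "NA"
      for (i = 12; i <= NF; i++)               # optional tags start at column 12
        if ($i ~ /^XM:i:/) xm = substr($i, 6)
      print $1, xm, ins, del
    }
  '
}

# Usage (assumes samtools is installed and tmp.bam exists):
#   samtools view tmp.bam | cigar_edits | head
```

Adding the three numbers gives a per-read count of substituted, inserted, and deleted bases without re-aligning the read against the reference.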


@jingtao09, in my case the "definition of a mismatch" is the number of insertions/deletions/substitutions that occurred (as specified in the BWA paper). NM (edit distance) will be equally costly. However, I can't understand how to include the XM tag in the ./bwa aln hg19.fa SRR44930951.fastq > alnsa.sai command.

Another point: in some places I found that the option -n 0/1/2 sets the number of mismatches allowed. However, I can't tell whether that is true. It is a little confusing because some places say to use XM (but don't specify how) and others say to use -n 0/1/2.
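
For reference, `bwa aln -n` takes the maximum edit distance when given an integer (the default, 0.04, is interpreted as a fraction instead), so it is the search-time counterpart of Bowtie's `-v`, whereas XM and NM are tags written to the SAM output. A minimal sketch using the file names from the post; the helper `bwa_aln_cmds` and the `aln_n2` prefix are hypothetical, and `-o 0` (no gap opens) is added on the assumption that only substitutions should count, as with `bowtie -v`:

```shell
# Hypothetical helper: prints (does not run) the two BWA commands for a
# given maximum edit distance, so they can be checked before running.
#   $1 = max edit distance, $2 = reference FASTA, $3 = FASTQ, $4 = output prefix
bwa_aln_cmds() {
  # -n INT : maximum edit distance allowed per read
  # -o 0   : allow no gap opens, so every edit is a mismatch (assumption:
  #          this is the closest match to bowtie -v; drop it to allow indels)
  printf 'bwa aln -n %s -o 0 %s %s > %s.sai\n' "$1" "$2" "$3" "$4"
  # samse converts the .sai file into SAM, where the NM/XM tags appear
  printf 'bwa samse %s %s.sai %s > %s.sam\n' "$2" "$4" "$3" "$4"
}

bwa_aln_cmds 2 hg19.fa SRR44930951.fastq aln_n2
```

The printed lines can be pasted into a terminal, or run directly with `bwa_aln_cmds 2 hg19.fa SRR44930951.fastq aln_n2 | sh` once they look right.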


The XM tag is reported in your SAM file by default.


So, if I want to find matches allowing 0/1/2 errors, I just have to run the ./bwa aln hg19.fa SRR44930951.fastq > aln.sa.sai command, and later I can find the reads with 0/1/2 mismatches from aln.sa.sai? Actually, I am comparing BWA vs Bowtie. For Bowtie, I can restrict the output to a given number of mismatches with the command ./bowtie --all -v 0 hg19 SRR4930952.fastq SRR4930952.txt (-v specifies the number of mismatches). I want the equivalent command for BWA, to compare the two (its paper says it allows this).


So you can run bwa aln once with -n 4, then extract all the reads with n = 0, 1, 2, 3. There is no easy way to compare with Bowtie under the edit distance definition, because BWA uses Smith-Waterman-style scoring, which assigns different penalties to substitutions and indels. The best you can do is retrieve the alignments and count the mismatches with your own scripts. Otherwise, you can do a statistical comparison, which can save you a lot of time.
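
The "run once, then split by edit distance" idea can be sketched as a tally over the NM tag of the resulting SAM file. This assumes SAM text on stdin with NM tags on mapped reads; `nm_histogram` is a hypothetical name:

```shell
# Hypothetical helper: reads SAM text on stdin and prints
#   NM value <TAB> number of mapped reads with that edit distance
nm_histogram() {
  awk -F '\t' '
    /^@/ { next }                          # skip header lines
    int($2 / 4) % 2 == 1 { next }          # skip unmapped reads (FLAG bit 0x4)
    {
      for (i = 12; i <= NF; i++)           # optional tags start at column 12
        if ($i ~ /^NM:i:/) count[substr($i, 6) + 0]++
    }
    END { for (n in count) print n "\t" count[n] }
  ' | sort -n
}

# Usage (assumes a SAM file produced by bwa samse):
#   nm_histogram < aln.sam
```

Each output row is one edit-distance class, so the counts for 0, 1, and 2 can be set next to Bowtie runs with the corresponding `-v` values, bearing in mind that NM also includes indels.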

11.8 years ago

You will need to generate a SAM/BAM file from the sai file, then filter it with a simple string-matching tool (grep, perl, python, or awk) that looks for the tags in the optional fields of each alignment:

samtools view tmp.bam | grep -E "XM:i:(1|2)([[:space:]]|$)" | more
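
When filtering on a tag with grep, the pattern has to be anchored at the end of the value, otherwise `XM:i:1` also matches `XM:i:12`. An alternative sketch compares the value numerically and keeps every read at or below a threshold, including exact matches. It assumes SAM text on stdin; `filter_xm` is a hypothetical name:

```shell
# Hypothetical helper: keep SAM records whose XM tag is at most $1.
filter_xm() {
  awk -F '\t' -v max="$1" '
    /^@/ { print; next }                   # pass header lines through
    {
      for (i = 12; i <= NF; i++)           # optional tags start at column 12
        if ($i ~ /^XM:i:/ && substr($i, 6) + 0 <= max) { print; next }
    }
  '
}

# Usage (assumes samtools is installed and tmp.bam exists):
#   samtools view tmp.bam | filter_xm 2 | more
```

Records without an XM tag, such as unmapped reads, are dropped by this filter.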

But that means extracting the reads that have 1 or 2 mismatches from the sai files. What I want is this:

I am comparing BWA vs Bowtie. For Bowtie, I can restrict the output to a given number of mismatches with the command

./bowtie --all -v 0 hg19 SRR4930952.fastq SRR4930952.txt (-v specifies the number of mismatches).

I want the equivalent command for BWA, to compare the two (its paper says it allows this).

So, what command should I give to find exact matches and matches allowing 1/2 mismatches? (I am using 150 bp reads.)


You cannot extract them from the sai file, only from the final SAM file.
