Question: BWA - How To Use The XM Tag
8.2 years ago, Arpssss wrote:

I have installed BWA and built an index of hg19 using the command:

bwa index -a bwtsw hg19.fa

Now I compute the alignments using the command:

./bwa aln hg19.fa SRR44930951.fastq > alnsa.sai

However, I want to find alignments allowing 1 or 2 mismatches.

From the BWA home page, I gather that I have to use the XM tag.

But I can't work out how to use it, that is, what the command should be.

Can anybody please help me with this?

Thanks.

Tags: genome, bwa

Also asked at http://seqanswers.com/forums/showthread.php?t=21208

written 8.2 years ago by Neilfws

Do you want to align reads with at most 1 or 2 mismatches (while running BWA), or do you want to find reads (e.g. in the SAM file) that have at most 1 or 2 mismatches after mapping?

written 8.2 years ago by Vikas Bansal

I want to align reads with at most 1 or 2 mismatches (while running BWA).

written 8.2 years ago by Arpssss

Also asked at: http://sourceforge.net/mailarchive/forum.php?thread_name=CAJdGSff4KwmuqfNLCUBvrYuXTOmez1nnD%3DCurriPYzohtoPkQQ%40mail.gmail.com&forum_name=bio-bwa-help

written 8.2 years ago by Pierre Lindenbaum

Hmm. But I found no answer there, not even a pointer :).

written 8.2 years ago by Arpssss

You need to combine the XM and NM tags. It also depends on how you define a mismatch. If substitutions and indels are all included, then BWA won't report that number directly. Reconstructing the alignment from the CIGAR string, the read, and the reference would be the most accurate way, but it is time-consuming.

written 8.2 years ago by jingtao09
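
One cheaper way to combine the tags with the CIGAR string, without rebuilding the full alignment, might look like the sketch below. It assumes NM follows the SAM specification's definition (mismatches plus inserted and deleted bases), so substitutions = NM minus the I and D lengths in the CIGAR. The function names and the sample record are made up for illustration.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indel_bases(cigar):
    """Total number of inserted plus deleted bases in a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "ID")


def get_int_tag(fields, tag):
    """Return the integer value of an optional SAM tag such as NM, or None."""
    prefix = tag + ":i:"
    for field in fields[11:]:  # optional fields start at column 12
        if field.startswith(prefix):
            return int(field[len(prefix):])
    return None


def substitutions_and_indels(sam_line):
    """Split the edit distance of one SAM record into (substitutions, indel bases)."""
    fields = sam_line.rstrip("\n").split("\t")
    nm = get_int_tag(fields, "NM")
    if nm is None or fields[5] == "*":
        return None  # unmapped read, or no NM tag present
    gaps = indel_bases(fields[5])
    return nm - gaps, gaps


# Made-up record: a 2-base deletion plus one substitution gives NM:i:3.
record = "\t".join(
    ["r1", "0", "chr1", "100", "37", "30M2D20M", "*", "0", "0",
     "A" * 50, "I" * 50, "NM:i:3", "XM:i:1"]
)
print(substitutions_and_indels(record))
```

Whether this count matches what another aligner calls a "mismatch" still depends on the definition being compared, as noted above.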

@jingtao09, in my case a "mismatch" means any insertion, deletion, or substitution (as specified in the BWA paper). NM (edit distance) would be equally costly. However, I can't understand how to include the XM tag in the ./bwa aln hg19.fa SRR44930951.fastq > alnsa.sai command.

Another point: in some places I found that the -n 0/1/2 option sets the allowed number of mismatches. However, I can't tell whether that is true. It is a little confusing, because some places say to use XM (but don't say how) and others say to use -n 0/1/2.

modified 8.2 years ago • written 8.2 years ago by Arpssss
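
For what it's worth, the BWA manual does list -n for bwa aln: an integer value is the maximum edit distance (so it counts gaps as well as substitutions), and -o is the maximum number of gap opens. XM, by contrast, is an output tag in the SAM file, not an option. Below is a minimal sketch that only builds the command lines and does not run BWA. The helper names are hypothetical, and using -o 0 to approximate Bowtie's mismatch-only -v mode is an assumption, not something confirmed in this thread.

```python
import shlex


def bwa_aln_cmd(ref, fastq, max_diff, allow_gaps=True):
    """Build a 'bwa aln' command with an integer maximum edit distance (-n)."""
    cmd = ["bwa", "aln", "-n", str(int(max_diff))]
    if not allow_gaps:
        # Assumption: forbidding gap opens leaves only substitutions,
        # which is closer to what 'bowtie -v' counts.
        cmd += ["-o", "0"]
    return cmd + [ref, fastq]


def bwa_samse_cmd(ref, sai, fastq):
    """Build the 'bwa samse' command that turns a .sai file into SAM."""
    return ["bwa", "samse", ref, sai, fastq]


# File names are taken from the question; output names are placeholders.
aln = bwa_aln_cmd("hg19.fa", "SRR44930951.fastq", 2, allow_gaps=False)
samse = bwa_samse_cmd("hg19.fa", "aln.sai", "SRR44930951.fastq")
print(shlex.join(aln) + " > aln.sai")
print(shlex.join(samse) + " > aln.sam")
```

Both commands write to standard output, hence the redirections in the printed lines.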

The XM tag is reported in your SAM file by default.

written 8.2 years ago by Vikas Bansal

So, if I want alignments allowing 0/1/2 errors, I just run the ./bwa aln hg19.fa SRR44930951.fastq > aln.sa.sai command, and later I can find the reads with 0/1/2 mismatches in aln.sa.sai? Actually, I am comparing BWA with Bowtie. With Bowtie, I can restrict the output by the allowed number of mismatches with the command ./bowtie --all -v 0 hg19 SRR4930952.fastq SRR4930952.txt (-v specifies the number of mismatches). I want an equivalent command for BWA, for comparison (its paper says it allows this).

modified 8.2 years ago • written 8.2 years ago by Arpssss

So you can run bwa once with -n 4, then separate out the reads with n = 0, 1, 2, 3. There is no easy way to compare with Bowtie under an edit-distance definition, because BWA assigns different penalties to substitutions and indels. The best you can do is to retrieve the alignments and count the mismatches with your own scripts. Otherwise, you can do a statistical comparison; it can save you a lot of time.

written 8.2 years ago by jingtao09
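
The "run once, then split by n" idea could be sketched roughly as follows: read the SAM records and tally how many reads carry each NM value. This assumes the SAM lines come from bwa samse output (or samtools view) and that mapped reads carry an NM tag. The function name and the three sample records are invented for the example.

```python
from collections import Counter


def edit_distance_histogram(sam_lines, tag="NM"):
    """Count reads per value of an integer SAM tag (NM by default)."""
    counts = Counter()
    prefix = tag + ":i:"
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        for field in fields[11:]:  # optional fields start at column 12
            if field.startswith(prefix):
                counts[int(field[len(prefix):])] += 1
                break  # reads without the tag (e.g. unmapped) are not counted
    return counts


def make_record(name, nm):
    """Build a made-up 10 bp SAM record with the given NM value."""
    return "\t".join(
        [name, "0", "chr1", "100", "37", "10M", "*", "0", "0",
         "ACGTACGTAC", "IIIIIIIIII", "XT:A:U", "NM:i:%d" % nm]
    )


sample = ["@SQ\tSN:chr1\tLN:1000", make_record("r1", 0),
          make_record("r2", 2), make_record("r3", 0)]
for distance, n_reads in sorted(edit_distance_histogram(sample).items()):
    print(distance, n_reads)
```

Passing tag="XM" would tally BWA's mismatch count instead of the edit distance.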
8.2 years ago, Istvan Albert (University Park, USA) wrote:

You will need to generate a SAM/BAM file from the sai file, then filter it with a simple string-matching program (grep/perl/python/awk) that selects on the tags in the optional fields of each alignment:

samtools view tmp.bam | grep -wE "XM:i:(1|2)" | more
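
A Python version of the same filter might look like this. It is only a sketch: it parses the XM value as an integer, so XM:i:12 is never mistaken for 1 (which a loose text pattern can do). The function name and the sample records are made up.

```python
def has_xm_in(sam_line, allowed=(1, 2)):
    """Return True if the record's XM tag (mismatch count) is one of `allowed`."""
    fields = sam_line.rstrip("\n").split("\t")
    for field in fields[11:]:  # optional fields start at column 12
        if field.startswith("XM:i:"):
            return int(field[len("XM:i:"):]) in allowed
    return False  # no XM tag, e.g. an unmapped read


def make_record(name, xm):
    """Build a made-up 10 bp SAM record with the given XM value."""
    return "\t".join(
        [name, "0", "chr1", "100", "37", "10M", "*", "0", "0",
         "ACGTACGTAC", "IIIIIIIIII", "NM:i:%d" % xm, "XM:i:%d" % xm]
    )


records = [make_record("r1", 0), make_record("r2", 1),
           make_record("r3", 2), make_record("r4", 12)]
kept = [r.split("\t")[0] for r in records if has_xm_in(r)]
print(kept)
```

In practice the lines would come from the SAM file or from samtools view on standard input, and allowed=(0, 1, 2) would keep exact matches as well.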

But that means I extract, from the sai files, the reads that have 1 or 2 mismatches. What I want is this:

Actually, I am comparing BWA with Bowtie. With Bowtie, I can restrict the output by the allowed number of mismatches with the command

./bowtie --all -v 0 hg19 SRR4930952.fastq SRR4930952.txt (-v specifies the number of mismatches).

I want an equivalent command for BWA, for comparison (its paper says it allows this).

So, what command should I give to find exact matches, and matches allowing 1 or 2 mismatches? (I am using 150 bp reads.)

written 8.2 years ago by Arpssss

You cannot extract them from the sai file, only from the final SAM file.

written 8.2 years ago by jingtao09