Hi all, I'm playing with MAQ/BWA and I'm trying to map the following short read on chr22/hg19:
@repeat_1 GGTGTCTGGGACCAGAAAATAGTGAGGACCCTCTTAC + zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
it should match twice at
22 + 22678160 22678196 22 + 22759880 22759916
when I use GWA , the read is mapped at 51304531 (?? 51304531+37 = length(chr22) ???)
bwa aln -N chr22.fa chr22.sai repeat.fastq > chr22.sai bwa samse chr22.fa chr22.sai repeat.fastq @SQ SN:Chr22 LN:51304566 repeat_1 4 Chr22 51304531 0 37M * 0 0 GGTGTCTGGGACCAGAAAATAGTGAGGACCCTCTTAC zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz XT:A:N NM:i:27 XN:i:36X0:i:4 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:2A0T0C2A0T0C0C0T1G1G0C0C0T1C0C0A0C0C2T0T1G0C0A0A0C0T0A0
when I use MAQ, only one position was reported.
I'm sure I'm missing something here, help ! :-)
Many thanks, pierre
Maq does not handle repeats, and will randomly report one of the mapped positions:
BWA has similar behavior by default, but the -R flag to aln will give you repeats:
Last time I checked, multiple hit reports are not in SAM format. For projects needing repeats, I normally use Bowtie:
For your question about the BWA output -- that SAM line indicates the read is unmapped. The easiest cue to pick up on is the 4 in the flag field. I'd expect it to give you one hit randomly assigned with either of those positions, so not sure why that is.