Question: Uniquly aligned reads by BWA-mem (MAPQ, and XA tag)
1
gravatar for biobiu
2.0 years ago by
biobiu110
United States
biobiu110 wrote:

I'm trying to filter out reads that were primarily aligned to one genomic region (if a read has secondary alignment I want to keep the primary). I've tried to do this with two methods- (i) filter out reads with MAPQ=0, which should have been aligned to more than one region. (ii) read with XA tag which should indicate alternative hits. However, there are read with MAPQ 0 but without XA tag (example 1) and reads with MAPQ>0 with XA tag.

Example 1- MAPQ 0 but without XA tag SRR1338241.3 83 chr5 11625 0 76M = 11507 -194 CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCT #####################################?8:8:BD?:EC8)9DEF?AB?<a++28ddd??d3dd?<? nm:i:0="" md:z:76="" as:i:76="" xs:i:76<="" p="">

Example 2- MAPQ>0 with XA tag SRR1338241.239 99 chr1 14643 25 76M = 14706 139 GTCAGAGCAATGGCCCAAGTCTGGGTCTGGGGGGGAAGGTGTCATGGAGCCCCCTACGATTCCCAGTCGTCCTCGT C@CFFFFFHHHHHIIIJJJFHIJJJEGGJJJJJDBDDDD8@CDDEDDDDCBDDDDDDDDDDDEDDDDDDBDDDDDD NM:i:1 MD:Z:10C65 AS:i:71 XS:i:71 XA:Z:chr9,+14754,76M,1;chr15,-102516447,76M,2;chr16,+64329,13M1I62M,2;chr12,-90907,76M,3;chr2,-114356295,76M,3;

Does anyone has any explanation to this discrepancy? or any other suggestion to do it?

next-gen alignment • 1.6k views
ADD COMMENTlink modified 22 months ago by Biostar ♦♦ 20 • written 2.0 years ago by biobiu110
1

Can you give your command line you used ?

ADD REPLYlink written 2.0 years ago by Titus910

For mapping? bwa-0.7.12/bwa mem -t 16 hg19.bwa_index/hg19 file_1.fastq file_2.fastq > file.sam 2> file.log‏ Thanks

ADD REPLYlink written 2.0 years ago by biobiu110

Thank you , so your 2 examples of reads are examples after filtering methods ? Haw did you filter MAPQ 0 and XA ? Did you used command line (like this in this post BWA mem how to know if a read is mapped uniquely? ) ?

ADD REPLYlink written 2.0 years ago by Titus910

Thanks. No its before filtering. My problem is not that I have technical problem to filter it out (which can be done easily by awk or grep), my problem is that I'm actually don't understand how it make sense that some reads have MAPQ 0 without XA tag and others have XA with MAPQ greater than 0. So I believe that I'm missing something here, and maybe there is easier way to filter these out.

ADD REPLYlink written 2.0 years ago by biobiu110

If i remember correctly your second case ( XA with MAPQ greater than 0 ) you can detect homologous genes with. For the MAPQ 0 in general it's low complexity region like in your example. So it's interesting if you work on this type of problematic ? (May be i m wrong ^^ )

ADD REPLYlink written 2.0 years ago by Titus910

Maybe, but since I have to know that my read were mapped correctly I have to filter these cases out.

ADD REPLYlink written 2.0 years ago by biobiu110

Hello, you may want to take a look here: Bwa Mapping Qualities Of 0

This also makes for interesting reading: More madness with MAPQ scores (a.k.a. why bioinformaticians hate poor and incomplete software documentation)

ADD REPLYlink written 2.0 years ago by Kevin Blighe52k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 708 users visited in the last hour