Question: mapped/unmapped mate with samtools
0
gravatar for Flavia
4.9 years ago by
Flavia0
Pisa
Flavia0 wrote:

Hi,

I'm kind of newbie about BWA and I have a trivial question.

I performed an alignment with paired reads and now I'm trying to extract some of them with samtools.

1) mapped reads whose mate is unmapped (samtools view -f4 -F8 in.bam)

2)unmapped reads whose mate is mapped (samtools view -f8 -F4 in.bam)

The problem is that I did expect that the same number of reads in both files (they are complementary, aren't they?). But I obtained more reads among the unmapped.

I can't find any explanation. thanks in advance

ADD COMMENTlink written 4.9 years ago by Flavia0

FYI, -f 4 -F 8 will give unmapped reads with mapped mates and -f 8 -F 4 will give mapped reads with unmapped mates.

What's the output of samtools view -c -f 0x800 in.bam

ADD REPLYlink written 4.9 years ago by Devon Ryan92k

oh sorry..in the post I inverted the code for mapped/unmapped!!

however ..with -f 0x800 I obtain 0.. but I don't know the meaning

ADD REPLYlink written 4.9 years ago by Flavia0

BWA can produce chimeric/non-linear alignments, which will have the 0x800 bit in the flag set. I had hoped that that was the cause of the mismatch, but I guess not.

BTW, what sort of difference in number of reads are we talking about here. Just 1 or two or a significant number?

ADD REPLYlink written 4.9 years ago by Devon Ryan92k

the unmapped mates are 1453574; the mapped mates 1418724. so I have a difference of 34850..quite significant, I think.

ADD REPLYlink written 4.9 years ago by Flavia0

did you filter the BAM after aligning with bwa ?

ADD REPLYlink written 4.9 years ago by Pierre Lindenbaum123k

I don't think. I did the alignment and then I converted sam in bam

samtools view -bS in.sam > out.bam. that's it.

ADD REPLYlink written 4.9 years ago by Flavia0

So.. looking at the unmapped file I found (for example) this couple of reads:

HWI-ST1210:100:D19M1ACXX:6:1103:9250:71414    103    CL282Contig2    975    29    87M2I7M    CL282Contig5    647    0    AAAAGGCTAAAGGACAGACCGCACTGTCAACCTCCATACCCGTGCACTCAACGCTATCAACGTACAACAACTACAAGAAACAACGATACGTTCAAC    CCCFFFFFHGHHHIJJJJJJJJJJJJIJIJJJIJJJJJJJJJGHJJJJJIJJJHHHFFFFFDDEDDDDDDDDDDDDDDDCDD@BDDBBDB>ABDDD    XT:A:U    NM:i:5    SM:i:29    AM:i:29    X0:i:1    X1:i:0    XM:i:3    XO:i:1    XG:i:2    MD:Z:89G1G1A0


HWI-ST1210:100:D19M1ACXX:6:1103:9250:71414    151    CL282Contig5    647    29    55M45S    CL282Contig2    975    0    ATGACAATCCACCAGCACTGCCACAAAACACATTAAGAAATCCATCCTGAAATCAAAACTGTCTTCGCCATCGGAATCATCAATCATTTTCACCATCTTT    @53C?A:5<(9DCDCCCBCBA@DFFEEEEEDHHHIIJIHGHJIHHJIIGJIGHJJIEHJIIIIHGJJJIGGGJJJIJJJJJJJIJJHGHHFBFFFF@CBF    XT:A:M    NM:i:8    SM:i:29    AM:i:29    XM:i:8    XO:i:0    XG:i:0    MD:Z:6G2A2A10T11C0A8A1G7


I completly miss the point..why are they among the unmapped mates?

----------------------------------------------------------------------------------------------------------

I will try to edit this, because I can't submit new post (I'm a new user)

that's the version

Program: bwa (alignment via Burrows-Wheeler transformation)
Version: 0.7.5a-r405


and that's the code

bwa aln -t 4 -l 12 -n 4 -k2 -o 3 -e 3 -M 2 -O 6 -E 3 library_contigs_288CL.fa  R1_001.paired.fq > DNAseq288_R1.sai

bwa aln -t 4 -l 12 -n 4 -k2 -o 3 -e 3 -M 2 -O 6 -E 3 library_contigs_288CL.fa  R2_001.paired.fq  > DNAseq288_R2.sai

bwa sampe -a 5000 -N 5000 -n 500 library_contigs_288CL.fa DNAseq288_R1.sai DNAseq288_R2.sai R1_001.paired.fq R2_001.paired.fq >  outDNAseq288.sam


samtools view -bS outDNAseq288.sam > outDNAseq288.bam

thanks a lot

ADD REPLYlink modified 4.9 years ago • written 4.9 years ago by Flavia0

Those flags make absolutely no sense. What's the exact command you gave to bwa? Also, what version did you use.

ADD REPLYlink written 4.9 years ago by Devon Ryan92k

You'd probably be better off using a more recent version and seeing if the incorrectly unset 0x8 bit in the flag is set properly there.

ADD REPLYlink written 4.9 years ago by Devon Ryan92k

Does anybody know how I can get unmapped reads with unmapped mates? Will samtools view -f8 -F8 in.bam work?

ADD REPLYlink written 23 months ago by gzm830

You should ask a new question to get input from more people. Your comment will only reach the people in this thread.

That being said, the answer is ;-) :

You need to select for both UNMAP and MUNMAP so -f 4 -f 8 ought to do it.

ADD REPLYlink modified 23 months ago • written 23 months ago by Istvan Albert ♦♦ 81k
0
gravatar for Istvan Albert
4.9 years ago by
Istvan Albert ♦♦ 81k
University Park, USA
Istvan Albert ♦♦ 81k wrote:

the flags are weird but may not be incorrect. There is a big "hole" in the spec there that says:

Bit 0x4 is the only reliable place to tell whether the read is unmapped. If 0x4 is set, no
assumptions can be made about RNAME, POS, CIGAR, MAPQ, bits 0x2, 0x10, 0x100 and
0x800, and the bit 0x20 of the previous read in the template.

So what that says is that if the read is unmapped the rest of the fields may still be set but we need to ignore that since the information may be incorrect. Now if one were to match the flags directly they would get a different result. I often seen tools completely ignore the above and assume that if the read is unmapped the rest of the fields will be changed accordingly. That is not the case.

To the OP: I think the correct course of action is to first remove all reads where both the read and the mate are unmapped. Then on the remaining reads the flags will work correctly. We have every right to complain we shouldn't need to do that - I mean c'mon having an unmapped read that is  listed as mapping to say position xyz with a certain mapping quality - yes that actually can happen.

ADD COMMENTlink modified 4.9 years ago • written 4.9 years ago by Istvan Albert ♦♦ 81k

Well, 0x8 is incorrect, which is the source of the problem here.

ADD REPLYlink written 4.9 years ago by Devon Ryan92k

that could be  -  I have some duties right now so I can't look into it more deeply - from experience I found that when counts don't match up it is almost always because the filtering ignores the 0x4 state and matches the flags exactly as we tell it to. But those flags can actually be set 'incorrectly'. 

ADD REPLYlink modified 4.9 years ago • written 4.9 years ago by Istvan Albert ♦♦ 81k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 548 users visited in the last hour