Now need to use Pindel(split-read approach to identify CNV). I'm just wondering how I can extract one-end mapped reads from BWA result? Does samtools have such options, or we need to extract by parsing the sam/bam file according to flags?
Edit: Just another question. After BWA running, I actually do the filtering to pick out those mapping with MAPQ > 20; will this affect picking up unmapped/one-end mapped reads?
I guess this won't affect finding one-end mapped reads; but those two-end unmapped reads cannot be picked up.
Edit again: This is my another question: I ran the command:
samtools view -u -f 8 -F 260 map.bam > oneEndMapped.bam
And let's look at the line of oneEndMapped.bam:
HWI-ST150_0129:2:1:1414:2357#0 153 chr1 154432976 37 101M = 154432976 0 ACATACATATGTATGTAATATGTGATTATTACTGGTGGTTTGCTAAATAATGACAAATGGAAAAAAATATATGTCATCCAAAGAGAGCCACTGTTAATTTT BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBdc^ddcacc\TdcdddddYbe_dffddcffcffecedecedaeeeedd\def^ffeff`c RG:Z:101127_s_2 XT:A:U NM:i:2 SM:i:37 AM:i:0 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:16G20A63
How to interpret this: To me, this looks like both ends have the same sequences, and map to the same position. (leftmost coordinate of one read; and leftmost of another). Why do we say this represents one read maps while the other one doesn't map? To describe "one read maps while the other one doesn't map", in my mind, it should be like:
HWI-ST150_0129:2:1:1414:2357#0 153 chr1 154432976 37 101M = * *
samtools view -u -f 4 -F 264 map.bam > oneEndUnmapped.bam
This oneEndUnmapped.bam is only 8K, means nothing. I don't quite understand this.