Hi everyone, I am looking for a conclusive way to obtain uniquely mapped reads from Bowtie2. I checked the manual of Bowtie2, which shows higher mapping quality is equal to more unique. I applied the quality 10 as a threshold and extracted reads above 10 MAPQ by samtools. (samtools view -q 10).
Basically, the alignment stats are:
67870428 (100.00%) were unpaired; of these: 25785662 (37.99%) aligned 0 times 27964580 (41.20%) aligned exactly 1 time 14120186 (20.80%) aligned >1 times
62.01% overall alignment rate = 42084766 reads.
As the general definition of uniquely mapped reads, the alignment should be exactly one time and I tried to remove 14120186 (20.80%) aligned >1 times these reads. However -q 10 does not remove all the reads (Kept in total 31850072)). So I increased -q to 20 . It kept 31564096, I also set -q 42 (Note max quality in the output sam I found was 42), which given me 26154743 reads, which is lesser than reads aligned exactly 1 time. I also checked online forums, and few people suggested to remove XS:i: tag, So I used this command as suggested here:
grep -E "@|NM:" bt2output.sam | grep -v "XS:" > uniq_bt2output.sam
It has given me 27964580 exactly what I wanted.
However, I am not sure if it is the correct way as I am removing the second alignment.
The issue is which value to set to extract unique reads or is there any specific tag inside sam which can allow me to fetch only uniquely mapped reads. Is there any specific option during alignment (like Bowtie -m 1, I didn't find equivalent in Bowtie2)?
I am not clear about the right or appropriate way.
I would appreciate any help. Thanks you Ankit