I have paired end data which I am mapping (using bwa - mem) to 17 viral references which are ~85% similar to one another (i.e I have one reference file that I give to bwa mem which has 17 different fastqs). So, obviously, I have most of my reads mapping to multiple references. Which is needed for my analysis.
However, I also want to get a count of uniquely mapped reads, i.e, if a read maps to 6/17 references, I want to count it only once. I have tried to filer my bam file based on quality or using grep on XA and XS tags but this removes all the 6 mapping.
samtools view -h sorted.bam | grep -v -e 'XA:Z:' -e 'SA:Z:' | samtools view -b > unique.bam samtools view -h sorted.bam | grep -v -e 'XA:Z:' -e 'SA:Z:' | samtools view -b > unique.bam
Is there a way I can keep one instance of those 6 mappings?