How to filter unique reads from sam file?
3
0
Entering edit mode
6.6 years ago
Medhat 9.0k

How to extract unique mapped reads from sam alignment file of pair end reads

The mapping was done using bowtie2.

there is different posts for the same topic suggesting to use the 11 , 12 columns AS and XS

others say Unique alignments in bowtie2 have MAPQ>=2

and which option shall I use for bowtie2 to get an option like -m from bowtie ?

next-gen alignment sequencing bowtie2 • 5.8k views
3
Entering edit mode
6.6 years ago

It depends on how unique unique should be. The only reliable way to select only alignments for which the next best alignment has a lower score is by comparing the AS and XS tags (when they're the same, the MAPQ will be 0 or 1, though the MAPQ can also be 1 if these tags differ). Having said all of that, you're better off filtering at a MAPQ of 5 or so for most purposes.

There is no equivalent of -m in bowtie2 and there's no exact way to implement it in post-processing if you were to use -a.

0
Entering edit mode

So there is no way for bowtie2 to map the unique reads from the start " and -k option also will not do the job"

0
Entering edit mode

Yeah, -k isn't really a substitute, since if it has more than what you specify with -k then it'll just won't report some of them, rather than suppressing all of them. Perhaps bowtie2 sets NH appropriately in that case, but I rather doubt it.

0
Entering edit mode

Actually, you probably could just post-process the output of -a, though that's likely going to be slow.

2
Entering edit mode
6.6 years ago
Brice Sarver ★ 3.7k

This post basically covers everything: Bowtie2, -M Alignment/Reporting Mode

It includes discussion of the different -k options and the different columns that are added when a read is identified as multiply mapped.

1
Entering edit mode
6.6 years ago
Renesh ★ 2.0k

To find unique alignment from bowtie2, you can check the NH flag option in sam file. The unique alignment will have flag NH:i:1.

1
Entering edit mode

Haven't used bowtie2 for a while but if I remember correctly Bowtie2 doesn't generate "NH:" tag for reads in the SAM file. I usually go with concordantly mapped paired-end reads using "YT:Z:CP" tag rather than using MAPQ to extract reliable alignments.

0
Entering edit mode
0
Entering edit mode

it turns out that tag YT:Z:CP report reads "exact 1 time" and reads with "concordantly > 1 times" once. How can I filter the 'exact 1 time` reads only?

0
Entering edit mode

+1 Indeed, this is more convenient than comparing AS and XS tags (at least assuming one means "has a second best alignment with an inferior score to the best alignment" by "unique").