how to extract pair end reads from bam file using samtool
2
1
Entering edit mode
5.7 years ago
Bioinfonext ▴ 420

I map raw pair end raw read to cDNA and now want to extract only pair end reads which are mapped to the cDNA.

I am using this samtool command, but it also reads whose one is mapped but other not:

samtools view -b -F4 in.bam > out_mapped.bam

RNA-Seq • 12k views
0
Entering edit mode

Depending what aligner you used you may have lost the R1/R2 read designationin the fastq header (bbmap preserves that when mapping). So keep that in mind as you extract the reads.

3
Entering edit mode
5.7 years ago
Charles Plessy ★ 2.8k

The samtools flags command prints the summary of the flags that you can use to filter your data, and the SAM format specification explains them in details.

\$ samtools flags

About: Convert between textual and numeric flag representation
Usage: samtools flags INT|STR[,...]

Flags:
0x1 PAIRED        .. paired-end (or multiple-segment) sequencing technology
0x2 PROPER_PAIR   .. each segment properly aligned according to the aligner
0x4 UNMAP         .. segment unmapped
0x8 MUNMAP        .. next segment in the template unmapped
0x10    REVERSE       .. SEQ is reverse complemented
0x20    MREVERSE      .. SEQ of the next segment in the template is reversed
0x40    READ1         .. the first segment in the template
0x80    READ2         .. the last segment in the template
0x100   SECONDARY     .. secondary alignment
0x200   QCFAIL        .. not passing quality controls
0x400   DUP           .. PCR or optical duplicate
0x800   SUPPLEMENTARY .. supplementary alignment


In your case, you need to filter in the PAIRED reads, and filter out both the unmapped (UNMAP) and the mapped whose mate is not mapped (MUNMAP). Perhaps you will be interested in properly paired reads as well (PROPER_PAIR), but this is more stringent than what you asked in your question.

Altogether, this makes either -f 0x1 -F 0xC, or -f 0x2. Have a look at the result of the command samtools flags 0xC if you are puzzled about 0xC.

2
Entering edit mode

In order to get a good combination of flags, I usually use http://broadinstitute.github.io/picard/explain-flags.html. You tick what you want to select for (or filter for) and use the number for the -f / -F parameter.