how to extract pair end reads from bam file using samtool
Entering edit mode
5.7 years ago
Bioinfonext ▴ 420

I map raw pair end raw read to cDNA and now want to extract only pair end reads which are mapped to the cDNA.

I am using this samtool command, but it also reads whose one is mapped but other not:

samtools view -b -F4 in.bam > out_mapped.bam

RNA-Seq • 12k views
Entering edit mode

Depending what aligner you used you may have lost the R1/R2 read designationin the fastq header (bbmap preserves that when mapping). So keep that in mind as you extract the reads.

Entering edit mode
5.7 years ago
Charles Plessy ★ 2.8k

The samtools flags command prints the summary of the flags that you can use to filter your data, and the SAM format specification explains them in details.

$ samtools flags

About: Convert between textual and numeric flag representation
Usage: samtools flags INT|STR[,...]

    0x1 PAIRED        .. paired-end (or multiple-segment) sequencing technology
    0x2 PROPER_PAIR   .. each segment properly aligned according to the aligner
    0x4 UNMAP         .. segment unmapped
    0x8 MUNMAP        .. next segment in the template unmapped
    0x10    REVERSE       .. SEQ is reverse complemented
    0x20    MREVERSE      .. SEQ of the next segment in the template is reversed
    0x40    READ1         .. the first segment in the template
    0x80    READ2         .. the last segment in the template
    0x100   SECONDARY     .. secondary alignment
    0x200   QCFAIL        .. not passing quality controls
    0x400   DUP           .. PCR or optical duplicate
    0x800   SUPPLEMENTARY .. supplementary alignment

In your case, you need to filter in the PAIRED reads, and filter out both the unmapped (UNMAP) and the mapped whose mate is not mapped (MUNMAP). Perhaps you will be interested in properly paired reads as well (PROPER_PAIR), but this is more stringent than what you asked in your question.

Altogether, this makes either -f 0x1 -F 0xC, or -f 0x2. Have a look at the result of the command samtools flags 0xC if you are puzzled about 0xC.

Entering edit mode

In order to get a good combination of flags, I usually use You tick what you want to select for (or filter for) and use the number for the -f / -F parameter.


Login before adding your answer.

Traffic: 1774 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6