Question: how to extract pair end reads from bam file using samtool
gravatar for Bioinfonext
2.5 years ago by
Bioinfonext150 wrote:

I map raw pair end raw read to cDNA and now want to extract only pair end reads which are mapped to the cDNA.

I am using this samtool command, but it also reads whose one is mapped but other not:

samtools view -b -F4 in.bam > out_mapped.bam

rna-seq • 5.1k views
ADD COMMENTlink modified 21 months ago by Biostar ♦♦ 20 • written 2.5 years ago by Bioinfonext150

Depending what aligner you used you may have lost the R1/R2 read designationin the fastq header (bbmap preserves that when mapping). So keep that in mind as you extract the reads.

ADD REPLYlink modified 2.5 years ago • written 2.5 years ago by genomax69k
gravatar for Charles Plessy
2.5 years ago by
Charles Plessy2.7k
Charles Plessy2.7k wrote:

The samtools flags command prints the summary of the flags that you can use to filter your data, and the SAM format specification explains them in details.

$ samtools flags

About: Convert between textual and numeric flag representation
Usage: samtools flags INT|STR[,...]

    0x1 PAIRED        .. paired-end (or multiple-segment) sequencing technology
    0x2 PROPER_PAIR   .. each segment properly aligned according to the aligner
    0x4 UNMAP         .. segment unmapped
    0x8 MUNMAP        .. next segment in the template unmapped
    0x10    REVERSE       .. SEQ is reverse complemented
    0x20    MREVERSE      .. SEQ of the next segment in the template is reversed
    0x40    READ1         .. the first segment in the template
    0x80    READ2         .. the last segment in the template
    0x100   SECONDARY     .. secondary alignment
    0x200   QCFAIL        .. not passing quality controls
    0x400   DUP           .. PCR or optical duplicate
    0x800   SUPPLEMENTARY .. supplementary alignment

In your case, you need to filter in the PAIRED reads, and filter out both the unmapped (UNMAP) and the mapped whose mate is not mapped (MUNMAP). Perhaps you will be interested in properly paired reads as well (PROPER_PAIR), but this is more stringent than what you asked in your question.

Altogether, this makes either -f 0x1 -F 0xC, or -f 0x2. Have a look at the result of the command samtools flags 0xC if you are puzzled about 0xC.

ADD COMMENTlink written 2.5 years ago by Charles Plessy2.7k

In order to get a good combination of flags, I usually use You tick what you want to select for (or filter for) and use the number for the -f / -F parameter.

ADD REPLYlink written 2.5 years ago by michael.ante3.3k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1508 users visited in the last hour