Question

Extracting only paired reads from .bam

7.9 years ago

MS485 • 0

Hi,

I have the unmapped reads from an RNA-seq experiment (unmapped.bam) and I want to try to map them to a different genome. I sorted my unmapped.bam using

samtools view -n unmapped.bam unmappedsorted.bam

I have checked that the two files contain the same number of reads. I then wanted to convert unmappedsorted.bam to FASTQ format so that I can use TopHat to map those reads to the other genome. I tried:

bedtools bamtofastq -i unmappedsorted.bam -fq unmappedsorted1.fq -fq2 unmappedsorted2.fq

But this gives an error along the lines of: ... is marked as paired and the mate does not occur next to it. skipping.

Is there a way I can extract only the paired reads from my unmappedsorted.bam file to convert to FASTQ? Or is there something wrong with unmappedsorted.bam that I'm missing here?

Thanks in advance!

M

P.S. I am very new to Linux and RNA-seq analysis.

RNA-Seq • 2.6k views

score 0 · Answer 1 · 2016-06-02

7.9 years ago

GouthamAtla 12k

Your command to sort the BAM file is wrong: samtools view does not sort. Use samtools sort instead (with -n to sort by read name). That aside, I don't see any problem here. There could be singletons in the unmapped BAM, so it's skipping those reads. What's the output of

samtools flagstat in.bam

Also have a look at Picard's SamToFastq, which handles this conversion very elegantly.
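As a rough sketch of the suggestions above, the shell function below checks the flags, keeps only records whose mate is also unmapped, sorts them by read name, and then runs the same bedtools conversion as in the question. The function name, the output prefix, and the intermediate file names are hypothetical. It also assumes the mate-unmapped flag (0x8) is set correctly in unmapped.bam, which may not hold for every aligner's unmapped output, so treat it as a starting point rather than a definitive fix.

```shell
#!/usr/bin/env bash
# Sketch only: file names and the function name are placeholders.
set -euo pipefail

extract_unmapped_pairs() {
    local in_bam="$1"    # input BAM, e.g. unmapped.bam
    local prefix="$2"    # output prefix (hypothetical), e.g. unmapped

    # Summarise the flags first to see how many reads are marked as paired.
    samtools flagstat "$in_bam"

    # -f 12 keeps records with both "read unmapped" (0x4) and
    # "mate unmapped" (0x8) set; -F 256 drops secondary alignments.
    # Assumption: these flags are reliable in the input file.
    samtools view -b -f 12 -F 256 -o "${prefix}.pairs.bam" "$in_bam"

    # Sort by read name (-n) so that mates sit next to each other.
    samtools sort -n -o "${prefix}.pairs.namesorted.bam" "${prefix}.pairs.bam"

    # Convert the name-sorted pairs to two FASTQ files.
    bedtools bamtofastq -i "${prefix}.pairs.namesorted.bam" \
        -fq "${prefix}_1.fq" -fq2 "${prefix}_2.fq"
}
```

After sourcing the script, it could be called as `extract_unmapped_pairs unmapped.bam unmapped`. If bedtools still reports mates that are not adjacent, the flags in the input are probably not trustworthy, and a tool that writes unpaired reads to a separate file (such as Picard's SamToFastq) may be the safer route.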

I have tried using samtools sort -n rather than samtools view -n. This results in the same error.

ADD REPLY • link 7.9 years ago by MS485 • 0