extract unmapped pair end read from sam files
0
1
Entering edit mode
4.4 years ago
Bioinfonext ▴ 460

Hi,

I do have multiple pair end sam files. I am thinking to use below command to extract unmapped read from all sam files to bam files by using below command and after this I am trying to use reformat.sh from bbtools to extract the unmapped read pair from bam file.

Is there any way to run the reformat.sh on all unmapped bam files to extract the pair end reads or any other best way to do it.

ls *.sam | xargs -n 1 -I {} sh -c 'samtools view -f 4 -b {} > {}.bam'

reformat.sh in=x.bam out=z.fq unmappedonly primaryonly

thanks nabiyogesh

RNA-Seq linux samtools bbtools • 1.3k views
ADD COMMENT
0
Entering edit mode

Why not try something like (BTW: I am using your command line above which appears to be incorrect). Reformat.sh should work with SAM files as well so you don't need to convert them just for this.

for i in *.bam ; 
do name=$(basename ${i} .bam); 
reformat.sh in=${name}.bam out=${name}.fq unmappedonly primaryonly; 
done
ADD REPLY
0
Entering edit mode

Hi Genomax,

thanks for your all help. above command seems very confusing for me as it only generate a single fastq files if I run it on sam file as well.

instead of this I am using the below three command in loop command one by one as suggest by you earlier: if I come across any issue, will let you know.

$samtools view -Sb -f12 sample.sam > sample.unmapped.bam


$ samtools sort -n sample.unmapped.bam sample.unmapped.qsort.bam

$ bedtools bamtofastq -i sample.unmapped.qsort.bam \
                      -fq sample.unmapped.end1.fq \
                      -fq2 sample.unmapped.end2.fq

#!/bin/bash
#########################################################
#RUN_HTseq

module add apps/samtools/1.9/gcc-4.8.5

module add apps/bedtools/2.25.0/gcc-4.8.5

 for i in `ls -1 *L001.sam*| sed 's/\_L001.sam//'`; do echo "samtools view -Sb -f12 ${i}_L001.sam  >${i}_unmapped.bam" >> cmd2_file;done

~

Thanks nabiyogesh

ADD REPLY
0
Entering edit mode

As I said in my earlier comment the command you had was what I reused. It was not correct, if you had paired end data.

You will need to use for that case:

reformat.sh in=${name}.bam out1=${name}_R1.fq out2=${name}_R2.fq unmappedonly primaryonly
ADD REPLY

Login before adding your answer.

Traffic: 2468 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6