Question: how to get total read count, mapped read count and unmapped read count from multiple sam or bam files
0
gravatar for Bioinfonext
5 months ago by
Bioinfonext320
Korea
Bioinfonext320 wrote:

Hi,

I have got multiple sam files, could anyone please suggest a loop to get the total read count in pair, total mapped read count in pair and total unmapped read count in pair in a tab-delimited file with a file name.

sam file name is like this:

Leaf_T3_S_R2_S45_L001.sam
Leaf_T3_F_R9_S12_L001.sam
Root_T3_F_R7_S29_L001.sam

Many thanks

phython bash R rnaseq • 272 views
ADD COMMENTlink written 5 months ago by Bioinfonext320
1

you are asking two questions: a) how to loop over files b) how to extract mapped counts. a) is not really a topic for this site and you can find answers on stackoverflow or google. b) was answered multiple times.

ADD REPLYlink written 5 months ago by Ido Tamir5.1k
1

Use samtools idsxtats.

ADD REPLYlink written 5 months ago by GenoMax96k

Hi genomax,

I have converted sam files to bam and I am trying to use below commnad for getting total unmapped reads number by using reformat.sh from bbtools but it is not giving any output (empty files) ? could you please suggest what is the error?

file names are like this;
Leaf_T1_S_R7_S34_L001.bam
Leaf_T1_S_R8_S33_L001.bam
Leaf_T1_S_R9_S32_L001.bam

batch command:

#!/bin/bash
#SBATCH --job-name=unmppaed_read     # Job name
#SBATCH --partition=k2-lowpri
#########################################################
#RUN_reformat.sh
module add bbtools/38.63 
    reformat.sh -in=Leaf_T1_F_R10_S1_L001.bam out=Leaf_T1_F_R10_S1_L001.unmapped.bam unmappedonly >Leaf_T1_F_R10_S1_L001.unmapped.COUNT.txt
     reformat.sh -in=Leaf_T1_F_R2_S5_L001.bam out=Leaf_T1_F_R2_S5_L001.unmapped.bam unmappedonly >Leaf_T1_F_R2_S5_L001.unmapped.COUNT.txt
     reformat.sh -in=Leaf_T1_F_R3_S4_L001.bam out=Leaf_T1_F_R3_S4_L001.unmapped.bam unmappedonly >Leaf_T1_F_R3_S4_L001.unmapped.COUNT.txt

Thanks

ADD REPLYlink modified 5 months ago • written 5 months ago by Bioinfonext320

There are no - in front of BBTools options. Just in=. I am not sure that redirector will work. Will have to check.

ADD REPLYlink written 5 months ago by GenoMax96k

thanks genomax,

this command is working perfectly;

reformat.sh in=Leaf_T1_F_R10_S1_L001.bam out=Leaf_T1_F_R10_S1_L001.unmapped.bam, unmappedonly

but as you said redirection is not working!

Many thanks

ADD REPLYlink modified 5 months ago • written 5 months ago by Bioinfonext320
1

You could use a #SBATCH -e file.err in your script, which will collect the stats output for all three jobs in file.err, via SLURM log. You will need to parse out the individual results afterwards.

ADD REPLYlink modified 5 months ago • written 5 months ago by GenoMax96k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2080 users visited in the last hour
_