Question: Multiple sam to bam followed by raw read count
Bioinfonext120 wrote:

I have multiple SAM files like this:

SAM file location:

/data/SNU_work/Analysis/mapped/mapped

samtools location:

/home/yog/software/samtools-1.3.1/samtools

218_9W_Pa2.sam

218_9W_Pa1.sam

216_7W_Co1.sam

216_7W_Ca2.sam

216_7W_Ca1.sam

I am converting them one by one using the command below:

/home/yog/software/samtools-1.3.1/samtools view -b 218_9W_Pa2.sam > 218_9W_Pa2.bam

After that, I extract the mapped reads from the BAM file and sort the result using the commands below:

 /home/yog/software/samtools-1.3.1/samtools view -b -F4 218_9W_Pa2.bam > 218_9W_Pa2.mapped.bam


/home/yog/software/samtools-1.3.1/samtools sort 218_9W_Pa2.mapped.bam -o 218_9W_Pa2_mapped_sort.bam

After sorting, I also want to index each BAM file and then get the raw read counts:

For indexing:

/home/yog/software/samtools-1.3.1/samtools index 218_9W_Pa2_mapped_sort.bam

For read counts:

/home/yog/software/samtools-1.3.1/samtools idxstats 218_9W_Pa2_mapped_sort.bam > readcount_for_each_bam

Can you please suggest how I can run all of these steps on all SAM files with a single command or script?

Thanks

rna-seq • 1.1k views
ADD COMMENTlink modified 21 months ago • written 21 months ago by Bioinfonext120
Istvan Albert ♦♦ 79k wrote:

You can add all your flags into a single command.

In addition, the latest samtools does not even need the -S flag as it detects the input type automatically.

samtools view -F 4 -b data.sam > data.bam

To run this command on all SAM files, you could automate it with:

ls *.sam | xargs -n 1 -I {} sh -c 'samtools view -F 4 -b {} > {}.bam'

Annoyingly, this produces a double extension (.sam.bam), so you would also need to apply a batch rename (you can find many examples of that on StackOverflow).
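One way the batch rename could look is sketched below. It relies only on the shell's suffix-stripping parameter expansion; the function name `rename_sam_bam` is made up for this illustration and is not part of samtools.

```shell
# Hypothetical helper: rename every *.sam.bam file in the current
# directory to *.bam (for example data.sam.bam -> data.bam).
rename_sam_bam() {
    local f
    for f in *.sam.bam; do
        [ -e "$f" ] || continue          # glob matched nothing, skip
        mv -- "$f" "${f%.sam.bam}.bam"   # strip ".sam.bam", append ".bam"
    done
}
```

Run `rename_sam_bam` from the directory that holds the converted files.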

A more elegant solution (and the recommended practice) would be to use GNU Parallel:

ls *.sam | parallel 'samtools view -F 4 -b {} > {.}.bam'
ADD COMMENTlink modified 21 months ago • written 21 months ago by Istvan Albert ♦♦ 79k

I removed the -S flag from the command as you suggested.

ADD REPLYlink written 21 months ago by Bioinfonext120

I used this command and was able to extract the mapped reads from the SAM files into BAM format.

ls *.sam | xargs -n 1 -I {} sh -c 'samtools view -F 4 -b {} > {}.bam'

Now can you please suggest how to sort and index all these bam files?

ADD REPLYlink modified 21 months ago • written 21 months ago by Bioinfonext120

Replace the text within the single quotes with the command you wish to execute.

In general, when someone gives you advice, you need to make a concerted effort to understand what it consists of and how it works. That way it is easy to generalize and apply it to a different situation.

ADD REPLYlink modified 21 months ago • written 21 months ago by Istvan Albert ♦♦ 79k

Conceptually, a for loop is quite easy:

for f in *.bam
do
    samtools index "$f"
done
ADD REPLYlink written 21 months ago by WouterDeCoster37k
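The same loop idea can be stretched over the whole pipeline from the question (filter mapped reads, sort, index, idxstats). What follows is only a sketch under a few assumptions: the `SAMTOOLS` variable, the function names and the `.idxstats.txt` output name are invented for this example, and each sample gets its own idxstats file instead of the single `readcount_for_each_bam` file, so that samples do not overwrite each other.

```shell
#!/usr/bin/env bash
# Sketch only. SAMTOOLS is an assumed override variable (not a samtools
# feature); set it to the full path of the binary if samtools is not on PATH.
SAMTOOLS="${SAMTOOLS:-samtools}"

# Filter, sort, index and count reads for a single SAM file.
process_sam() {
    local sam="$1"
    local base="${sam%.sam}"                 # 218_9W_Pa2.sam -> 218_9W_Pa2
    local mapped="${base}.mapped.bam"
    local sorted="${base}_mapped_sort.bam"

    # -F 4 drops unmapped reads, -b writes BAM (same flags as in the thread).
    "$SAMTOOLS" view -b -F 4 "$sam" > "$mapped" || return 1
    "$SAMTOOLS" sort "$mapped" -o "$sorted" || return 1
    "$SAMTOOLS" index "$sorted" || return 1
    # Assumed output name: one idxstats table per sample.
    "$SAMTOOLS" idxstats "$sorted" > "${base}.idxstats.txt" || return 1
}

# Apply process_sam to every *.sam file in the current directory.
process_all() {
    local sam
    for sam in *.sam; do
        [ -e "$sam" ] || continue            # no SAM files here
        process_sam "$sam" || echo "failed: $sam" >&2
    done
}
```

To use it, source the script, change into the directory with the SAM files and call `process_all`. In each idxstats table the third column holds the number of mapped reads per reference sequence.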

To avoid .sam.bam you can do:

for j in *.sam ; do basename "$j" .sam ; done | sed ':a;N;$!ba;s/\n/ /g'

The sed command just replaces the newlines with spaces.

ADD REPLYlink written 10 months ago by CS10
badribio240 wrote:

Look here, this may help: Using Samtools On Many Files Recursively In One Go

ADD COMMENTlink written 21 months ago by badribio240

I would not recommend this post as it references an old version of samtools and the commands listed might not work anymore.

ADD REPLYlink modified 21 months ago • written 21 months ago by Istvan Albert ♦♦ 79k

Thank you for the correction! I should have mentioned the version difference.

ADD REPLYlink written 21 months ago by badribio240