Question: Multiple sam to bam followed by raw read count
0
2.1 years ago by
Bioinfonext150
Korea
Bioinfonext150 wrote:

I have multiple SAM files like this:

sam file location:

/data/SNU_work/Analysis/mapped/mapped

samtools location:

/home/yog/software/samtools-1.3.1/samtools

218_9W_Pa2.sam

218_9W_Pa1.sam

216_7W_Co1.sam

216_7W_Ca2.sam

216_7W_Ca1.sam

I am converting them one by one using the command below:

/home/yog/software/samtools-1.3.1/samtools view -b 218_9W_Pa2.sam > 218_9W_Pa2.bam

After that I extract the mapped reads from the BAM file and sort it using the commands below:

 /home/yog/software/samtools-1.3.1/samtools view -b -F4 218_9W_Pa2.bam > 218_9W_Pa2.mapped.bam


/home/yog/software/samtools-1.3.1/samtools sort 218_9W_Pa2.mapped.bam -o 218_9W_Pa2_mapped_sort.bam

After sorting the BAM file I also want to index it and then get the raw read counts:

  For indexing: /home/yog/software/samtools-1.3.1/samtools index 218_9W_Pa2_mapped_sort.bam



 For read count: /home/yog/software/samtools-1.3.1/samtools idxstats 218_9W_Pa2_mapped_sort.bam > readcount_for_each_bam

Could you please suggest how I can run all of these steps for all SAM files with a single command or script?

Thanks

rna-seq • 1.3k views
ADD COMMENTlink modified 2.1 years ago • written 2.1 years ago by Bioinfonext150
6
2.1 years ago by
Istvan Albert ♦♦ 80k
University Park, USA
Istvan Albert ♦♦ 80k wrote:

You can add all your flags into a single command.

In addition, the latest samtools does not even need the -S flag as it detects the input type automatically.

samtools view -F 4 -b data.sam > data.bam

To run all of these commands on all SAM files, you could automate with:

ls *.sam | xargs -n 1 -I {} sh -c 'samtools view -F 4 -b {} > {}.bam'

Annoyingly, this produces a double extension (.sam.bam), so you would also need to apply a batch rename (you can find many examples of that on StackOverflow).
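One possible batch rename is sketched below, assuming a POSIX shell. The function name rename_sam_bam is made up for illustration and is not part of samtools; it only relies on the shell's suffix-stripping parameter expansion and mv.

```shell
# Hypothetical helper: rename every *.sam.bam file in a directory to *.bam.
# The directory defaults to the current one.
rename_sam_bam() {
    rename_dir=${1:-.}
    for f in "$rename_dir"/*.sam.bam; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$f" ] || continue
        # ${f%.sam.bam} removes the double extension from the end of the name.
        mv -- "$f" "${f%.sam.bam}.bam"
    done
}
```

For example, rename_sam_bam /data/SNU_work/Analysis/mapped/mapped would turn 218_9W_Pa2.sam.bam into 218_9W_Pa2.bam. Note that mv will overwrite an existing file with the target name.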

A more elegant solution (and the recommended practice) would be to use GNU Parallel:

ls *.sam | parallel 'samtools view -F 4 -b {} > {.}.bam'
ADD COMMENTlink modified 2.1 years ago • written 2.1 years ago by Istvan Albert ♦♦ 80k

I removed flag -S from command as you have suggested.

ADD REPLYlink written 2.1 years ago by Bioinfonext150

I used this script and was able to extract the mapped reads from the SAM files into BAM format.

ls *.sam | xargs -n 1 -I {} sh -c 'samtools view -F 4 -b {} > {}.bam'

Now can you please suggest how to sort and index all these bam files?

ADD REPLYlink modified 2.1 years ago • written 2.1 years ago by Bioinfonext150
1

Replace the text within the single quotes with the command you wish to execute.

In general, when someone helps you with advice, you need to make a concerted effort to understand what it consists of and how it works; that way it is easy to generalize and apply it to a different situation.

ADD REPLYlink modified 2.1 years ago • written 2.1 years ago by Istvan Albert ♦♦ 80k

Conceptually, a for loop is quite easy:

for f in *.bam
do
    samtools index "$f"
done
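The same loop pattern could be stretched to cover every step from the question (filter, sort, index, idxstats). This is only a sketch: process_sam_dir is a made-up function name, the output names copy the _mapped_sort.bam pattern from the question, the .idxstats.txt suffix is an assumption, and it expects a samtools on the PATH that accepts sort -o (1.3 or later).

```shell
# Hypothetical helper: run the whole workflow on every *.sam file in a directory.
# Assumes "samtools" (1.3 or later) is on the PATH.
process_sam_dir() {
    sam_dir=${1:-.}
    for sam in "$sam_dir"/*.sam; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$sam" ] || continue
        base=${sam%.sam}                  # strip the .sam extension
        sorted=${base}_mapped_sort.bam    # naming pattern taken from the question
        # Keep mapped reads only (-F 4), emit BAM (-b), and sort straight from the pipe.
        samtools view -b -F 4 "$sam" | samtools sort -o "$sorted" - || return 1
        # idxstats needs an index of the sorted BAM.
        samtools index "$sorted" || return 1
        # One table per sample: reference name, length, mapped reads, unmapped reads.
        samtools idxstats "$sorted" > "${base}.idxstats.txt" || return 1
    done
}
```

Calling process_sam_dir /data/SNU_work/Analysis/mapped/mapped would then leave one sorted, indexed BAM and one idxstats table per sample. Writing a separate table per sample avoids each run overwriting a shared output file.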
ADD REPLYlink written 2.1 years ago by WouterDeCoster39k

To avoid .sam.bam you can do:

for j in *.sam ; do basename "$j" .sam ; done | sed ':a;N;$!ba;s/\n/ /g'

Here sed is just replacing the newlines with spaces.

ADD REPLYlink written 14 months ago by CS10
0
2.1 years ago by
badribio240
badribio240 wrote:

Look here, this may help: Using Samtools On Many Files Recursively In One Go

ADD COMMENTlink written 2.1 years ago by badribio240
2

I would not recommend this post as it references an old version of samtools and the commands listed might not work anymore.

ADD REPLYlink modified 2.1 years ago • written 2.1 years ago by Istvan Albert ♦♦ 80k

Thank you for the correction! I should have mentioned the version difference.

ADD REPLYlink written 2.1 years ago by badribio240