How do I merge multiple SAM files into one big SAM file?
5.5 years ago
jaqx008 ▴ 110

Hello everyone, I have a folder that contains, among other files, a set of SAM files that was previously split by someone else. I want to use the same data in its un-split form, so I am looking for a way to:

  1. pick the SAM files out of the folder using grep, because there are many of them (what grep command, please?)
  2. merge them into one SAM file.

Thanks

Sam samtools overlapZscore heatmap • 4.0k views
5.5 years ago
Medhat 9.7k

From SAM to sorted BAM (current samtools versions take the output file via -o):

for f in *.sam; do filename="${f%%.*}"; samtools view -b "$f" | samtools sort -@ 4 -o "${filename}.sorted.bam" -; done

Then you can merge them:

samtools merge out.bam in1.bam in2.bam in3.bam

So you can use:

samtools merge out.bam *.sorted.bam

Source: http://www.htslib.org/doc/samtools.html
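If you would rather not put hundreds of file names on the command line, samtools merge can also read its inputs from a list file via -b. A rough sketch, assuming the sorted BAMs from the loop above sit in one folder; the helper name write_bam_list and the default list name bam_list.txt are made up for illustration:

```shell
# Hypothetical helper: write the path of every *.sorted.bam in a folder
# to a list file, one path per line.
write_bam_list() {
    dir="${1:-.}"
    list="${2:-bam_list.txt}"
    : > "$list"                        # start with an empty list file
    for f in "$dir"/*.sorted.bam; do
        [ -e "$f" ] || continue        # the glob matched nothing
        printf '%s\n' "$f" >> "$list"
    done
}

# Usage (requires samtools):
#   write_bam_list /path/to/folder bam_list.txt
#   samtools merge -@ 4 -b bam_list.txt out.bam
#   samtools view -h -o out.sam out.bam   # only if you need SAM again
```

The glob also covers the "pick them out of the folder" step, so grep is not really needed here.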

Note:

-@ sets the number of threads.
Instead of a for loop you can use GNU parallel; see this example:
A: Run Samtools on multiple files
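
A minimal sketch of the parallel route, assuming GNU parallel and a current samtools are installed. The helper name print_sort_jobs is made up for illustration, and file names are assumed to contain no single quotes. It only prints one conversion command per .sam file; GNU parallel runs each input line as a command when it is given no command of its own:

```shell
# Hypothetical helper: print one "SAM -> sorted BAM" command per .sam file
# found directly inside a folder. Nothing is executed here.
# Assumes file names contain no single quotes.
print_sort_jobs() {
    dir="${1:-.}"
    for f in "$dir"/*.sam; do
        [ -e "$f" ] || continue            # the glob matched nothing
        out="${f%.sam}.sorted.bam"         # a.sam -> a.sorted.bam
        printf "samtools view -b '%s' | samtools sort -o '%s' -\n" "$f" "$out"
    done
}

# Usage (requires samtools and GNU parallel; runs 4 jobs at a time):
#   print_sort_jobs /path/to/folder | parallel -j 4
```

Printing the commands first also lets you check them by eye before running anything on hundreds of files.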


The thing is, I have hundreds of SAM files in the folder. It will take forever to convert them to BAM one after the other before I can merge them.


Then use parallel as suggested above.

If you want to use the SAM files directly, use MergeSamFiles from Picard tools, which accepts SAM or BAM input:

java -jar picard.jar MergeSamFiles \
      I=input_1.bam \
      I=input_2.bam \
      O=output_merged_files.bam

https://software.broadinstitute.org/gatk/documentation/tooldocs/4.0.0.0/picard_sam_MergeSamFiles.php
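
With hundreds of inputs you would not want to type every I= by hand. A sketch that builds them from a glob, assuming picard.jar is in the current directory as in the command above. The function name merge_with_picard and the DRY_RUN switch are made up for illustration; with DRY_RUN set, the command is printed instead of run:

```shell
# Hypothetical helper: merge every .sam file in a folder with Picard.
# Set DRY_RUN=1 to print the command instead of running it.
merge_with_picard() {
    dir="$1"
    out="$2"
    set --                                 # reuse "$@" as the argument list
    for f in "$dir"/*.sam; do
        [ -e "$f" ] || continue            # the glob matched nothing
        set -- "$@" "I=$f"
    done
    if [ "$#" -eq 0 ]; then
        echo "no .sam files found in $dir" >&2
        return 1
    fi
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is non-empty.
    ${DRY_RUN:+echo} java -jar picard.jar MergeSamFiles "$@" "O=$out"
}

# Usage (requires Java and picard.jar):
#   DRY_RUN=1 merge_with_picard /path/to/folder merged.bam   # preview
#   merge_with_picard /path/to/folder merged.bam             # run
```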


That's what screen or tmux are for. Coupled with parallel that should do the trick.


I'm not an expert in this field, so some of the terms are unfamiliar to me. However, I will try to install Picard and run the command. Thanks.
