What's the difference between Picard MergeBamAlignment and samtools merge?
5.5 years ago
Yingzi Zhang ▴ 90

Dear all, I want to do GATK variant calling. I have several raw paired-end fastq files for one pig individual, and I have finished mapping them with bwa mem. The GATK best practices (for calling germline SNPs + indels) say that after mapping I need to merge BAM files, and they recommend Picard MergeBamAlignment.

What is the difference between Picard MergeBamAlignment and samtools merge? I found that they need different input files: MergeBamAlignment needs an unmapped BAM file as well as the reference genome file. In my understanding, each MergeBamAlignment command handles only one fastq (pair) file, not multiple fastq files.

Are there any other differences? In particular, do their aims and their outputs differ? My impression is that MergeBamAlignment combines the information from the mapped file and the unmapped file, whereas samtools merge combines different mapped files into one big mapped file. Or can I ignore GATK's advice and simply use samtools merge instead? Thank you.

Yingzi

software error alignment • 4.4k views
5.5 years ago
ForeverFly ▴ 20

Hi Yingzi, so you are here!

I haven't used MergeBamAlignment (Picard), but I think the best way to learn about a piece of software is to read its original documentation.

For your case, first, the documentation of MergeBamAlignment (Picard) describes the tool as follows:

A command-line tool for merging BAM/SAM alignment info from a third-party aligner with the data in an unmapped BAM file, producing a third BAM file that has alignment data (from the aligner) and all the remaining data from the unmapped BAM. Quick note: this is not a tool for taking multiple sam files and creating a bigger file by merging them. For that use-case, see MergeSamFiles.
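
To make that description concrete, here is a minimal sketch of how such a call could be assembled for a single fastq pair. It is an illustration under assumptions, not a tested pipeline: the file names and the `picard.jar` location are hypothetical, and the helper function `merge_bam_alignment_cmd` is made up for this post. The arguments use Picard's `KEY=value` style; newer Picard releases also accept `-KEY value`.

```python
import shlex


def merge_bam_alignment_cmd(aligned_bam, unmapped_bam, reference_fasta,
                            output_bam, picard_jar="picard.jar"):
    """Build (but do not run) a Picard MergeBamAlignment command.

    One call handles one aligned BAM plus the unmapped BAM it came from.
    The picard_jar default is an assumed location; adjust to your install.
    """
    return [
        "java", "-jar", picard_jar, "MergeBamAlignment",
        f"ALIGNED_BAM={aligned_bam}",           # output of the aligner (e.g. bwa mem)
        f"UNMAPPED_BAM={unmapped_bam}",         # unmapped BAM holding the original read data
        f"REFERENCE_SEQUENCE={reference_fasta}",
        f"OUTPUT={output_bam}",
    ]


# Hypothetical file names for one lane of one sample.
mba_cmd = merge_bam_alignment_cmd(
    aligned_bam="lane1.aligned.bam",
    unmapped_bam="lane1.unmapped.bam",
    reference_fasta="reference.fa",
    output_bam="lane1.merged.bam",
)
print(shlex.join(mba_cmd))
```

The command is only printed here; passing the list to `subprocess.run(mba_cmd, check=True)` would execute it, one call per fastq pair.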

Second, the samtools documentation describes the merge command as follows:

Merge multiple sorted alignment files, producing a single sorted output file that contains all the input records and maintains the existing sort order.
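
By contrast, a samtools merge call takes one output file followed by several already-sorted input files. The sketch below only builds the command line; the BAM file names are hypothetical and the helper `samtools_merge_cmd` is invented for illustration.

```python
import shlex


def samtools_merge_cmd(output_bam, input_bams, threads=None):
    """Build (but do not run) a `samtools merge` command.

    The inputs are expected to be sorted in the same order already.
    """
    if len(input_bams) < 2:
        raise ValueError("merging needs at least two input files")
    cmd = ["samtools", "merge"]
    if threads is not None:
        cmd += ["-@", str(threads)]   # optional extra threads
    cmd.append(output_bam)            # output file comes first
    cmd.extend(input_bams)            # then every input file
    return cmd


# Hypothetical per-lane BAM files for the same individual.
st_cmd = samtools_merge_cmd(
    "sample.merged.bam",
    ["lane1.merged.bam", "lane2.merged.bam", "lane3.merged.bam"],
)
print(shlex.join(st_cmd))
```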

As you can see, the two tools have different functions, so you need to use both of them: MergeBamAlignment to combine each mapped file with its unmapped BAM, and a merging tool to combine the resulting BAM files into one. What's more, you can also merge multiple BAM files using MergeSamFiles (Picard) instead of samtools. Here is the description of MergeSamFiles (Picard):

Merges multiple SAM and/or BAM files into a single file. This tool is used for combining SAM and/or BAM files from different runs or read groups into a single file, similar to the "merge" function of Samtools (http://www.htslib.org/doc/samtools.html).

So I can confidently say that what you have understood is right.
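
The Picard equivalent repeats the `INPUT` argument once per file. Again, this is only a sketch under assumptions: the file names and the `picard.jar` path are hypothetical, and `merge_sam_files_cmd` is a made-up helper.

```python
import shlex


def merge_sam_files_cmd(input_bams, output_bam, picard_jar="picard.jar"):
    """Build (but do not run) a Picard MergeSamFiles command."""
    cmd = ["java", "-jar", picard_jar, "MergeSamFiles"]
    cmd += [f"INPUT={bam}" for bam in input_bams]   # one INPUT= per file
    cmd.append(f"OUTPUT={output_bam}")
    return cmd


# Hypothetical per-lane BAM files for the same individual.
msf_cmd = merge_sam_files_cmd(
    ["lane1.merged.bam", "lane2.merged.bam"],
    "sample.merged.bam",
)
print(shlex.join(msf_cmd))
```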

Yours :)

Duo


Thank you Duo. That makes sense. Thank you for your advice also.
