Question: What's the difference between Picard MergeBamAlignment and samtools merge?
Yingzi Zhang wrote:

Dear all,

I want to do GATK variant calling. I have several raw paired-end FASTQ files for one pig individual, and I have finished mapping them with bwa mem. The GATK Best Practices (for calling germline SNPs and indels) say that after mapping I need to merge BAM files, and they recommend Picard MergeBamAlignment.

What is the difference between Picard MergeBamAlignment and samtools merge? I found that they need different input files: MergeBamAlignment needs an unmapped BAM file as well as the reference genome file. In my understanding, each MergeBamAlignment command handles only one FASTQ file (or pair), not multiple FASTQ files. Are there any other differences? In particular, do their aims and their outputs differ?

My impression is that MergeBamAlignment combines the information from a mapped file and an unmapped file, whereas samtools merge combines different mapped files into one big mapped file. Or can I ignore GATK's advice and simply use samtools merge instead? Thank you.

Yingzi

modified 11 months ago by ForeverFly • written 11 months ago by Yingzi Zhang

I would avoid any Broad Institute/GATK/Picard software whenever possible. These are elaborate tools, but in my experience they are overly picky and therefore often a pain to use, and their error messages are typically rather uninformative. I personally do most standard tasks (BAM manipulation, sorting, indexing, merging) with sambamba or samtools. Merging your files with samtools merge should be fine.

written 11 months ago by ATpoint
ForeverFly wrote:

Hi Yingzi, so you are here!

I haven't used MergeBamAlignment (Picard), but I think the best way to learn about a piece of software is to read its original documentation.

In your case, we can first look at the relevant description in the MergeBamAlignment (Picard) documentation, which reads:

A command-line tool for merging BAM/SAM alignment info from a third-party aligner with the data in an unmapped BAM file, producing a third BAM file that has alignment data (from the aligner) and all the remaining data from the unmapped BAM. Quick note: this is not a tool for taking multiple SAM files and creating a bigger file by merging them. For that use case, see MergeSamFiles.
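To make that description concrete, here is a minimal sketch of how a MergeBamAlignment call for one FASTQ pair might be assembled. The file names and the `picard.jar` path are hypothetical, and the helper `merge_bam_alignment_cmd` is made up for illustration. The argument names (`ALIGNED_BAM`, `UNMAPPED_BAM`, `REFERENCE_SEQUENCE`, `OUTPUT`) follow Picard's classic `KEY=value` syntax, so please check them against the documentation for your Picard version. The code only builds and prints the command; it does not run anything.

```python
import shlex


def merge_bam_alignment_cmd(aligned_bam, unmapped_bam, reference_fasta,
                            output_bam, picard_jar="picard.jar"):
    """Build (not run) a Picard MergeBamAlignment command for ONE FASTQ pair.

    All paths are placeholders; picard_jar is an assumed location.
    """
    return [
        "java", "-jar", picard_jar, "MergeBamAlignment",
        f"ALIGNED_BAM={aligned_bam}",            # output of the aligner (e.g. bwa mem)
        f"UNMAPPED_BAM={unmapped_bam}",          # unmapped BAM made from the same reads
        f"REFERENCE_SEQUENCE={reference_fasta}",  # the reference the reads were mapped to
        f"OUTPUT={output_bam}",                  # aligned records plus data kept from the unmapped BAM
    ]


# Hypothetical file names for a single lane.
cmd = merge_bam_alignment_cmd(
    "lane1.aligned.bam", "lane1.unmapped.bam", "pig_ref.fa", "lane1.merged.bam"
)
print(shlex.join(cmd))
```

Note that the command takes exactly one aligned file and one unmapped file, which matches the observation in the question that each call covers a single FASTQ pair.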

Secondly, the samtools documentation describes the merge command as follows:

Merge multiple sorted alignment files, producing a single sorted output file that contains all the input records and maintains the existing sort order.
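For comparison, a samtools merge call might look like the sketch below. It assumes the positional form `samtools merge out.bam in1.bam in2.bam ...` and that the inputs are already sorted in the same order, as the quoted description requires. The file names and the helper `samtools_merge_cmd` are hypothetical, and the command is only printed, not executed.

```python
import shlex


def samtools_merge_cmd(output_bam, input_bams):
    """Build (not run) a `samtools merge` command combining several sorted BAMs."""
    if len(input_bams) < 2:
        raise ValueError("merging needs at least two input BAM files")
    # Positional form: output file first, then every input file.
    return ["samtools", "merge", output_bam, *input_bams]


# Hypothetical per-lane BAMs belonging to the same individual.
cmd = samtools_merge_cmd("pig.merged.bam", ["lane1.sorted.bam", "lane2.sorted.bam"])
print(shlex.join(cmd))
```

Unlike MergeBamAlignment, this takes any number of already-aligned files and needs neither an unmapped BAM nor a reference.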

As you can see, the two tools have different functions, so you need to use both of them: MergeBamAlignment to combine each aligned file with its unmapped BAM, and then a merging tool to combine the resulting files into one. What's more, you can also merge multiple BAM files using MergeSamFiles (Picard) instead of samtools. The description of MergeSamFiles (Picard) reads:

Merges multiple SAM and/or BAM files into a single file. This tool is used for combining SAM and/or BAM files from different runs or read groups into a single file, similar to the "merge" function of Samtools (http://www.htslib.org/doc/samtools.html).
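A rough sketch of the equivalent MergeSamFiles call follows. It assumes Picard's classic `KEY=value` syntax, in which `INPUT` is repeated once per file; the `picard.jar` path, the file names, and the helper `merge_sam_files_cmd` are hypothetical. As with the other sketches here, the command is only built and printed.

```python
import shlex


def merge_sam_files_cmd(output_bam, input_bams, picard_jar="picard.jar"):
    """Build (not run) a Picard MergeSamFiles command for several SAM/BAM files."""
    if not input_bams:
        raise ValueError("at least one input file is required")
    cmd = ["java", "-jar", picard_jar, "MergeSamFiles"]
    # INPUT is given once per file to be merged.
    cmd += [f"INPUT={bam}" for bam in input_bams]
    cmd.append(f"OUTPUT={output_bam}")
    return cmd


# Hypothetical per-lane outputs of MergeBamAlignment.
cmd = merge_sam_files_cmd("pig.merged.bam", ["lane1.merged.bam", "lane2.merged.bam"])
print(shlex.join(cmd))
```

This plays the same role as samtools merge: it is the step that comes after each lane has been processed on its own.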

So I can confidently say that what you have understood is right.

Yours :)

Duo

modified 11 months ago • written 11 months ago by ForeverFly

Thank you, Duo. That makes sense. Thank you for your advice as well.

written 11 months ago by Yingzi Zhang