Question: Difference Between GATK, Freebayes and SAMtools mpileup
3.8 years ago by
India
manojkumarbioinfo30 wrote:

Hi,

I'm new to NGS data analysis, and I am studying how the data are analysed, starting from the very basics.

For aligning the raw reads there are many tools, and each one uses a different algorithm. For example, BWA is based on the Burrows-Wheeler transform (BWT), Novoalign on Needleman-Wunsch, and MOSAIK on Smith-Waterman.
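To make the BWT mentioned above concrete, here is a minimal sketch of the transform itself. It uses the naive "sort all rotations" construction, which is only suitable for short strings; it is not how BWA builds its index, and the function name `bwt` and the `$` sentinel are assumptions chosen for illustration.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler transform: sort all rotations, take the last column.

    Assumes `sentinel` does not occur in `text` and sorts before every
    other character (true for "$" versus ASCII letters).
    """
    s = text + sentinel
    # Build every cyclic rotation of the sentinel-terminated string.
    rotations = [s[i:] + s[:i] for i in range(len(s))]
    rotations.sort()
    # The BWT is the last character of each sorted rotation.
    return "".join(rotation[-1] for rotation in rotations)


print(bwt("banana"))
```

The transformed string tends to group identical characters together, which is what makes the compressed, searchable index used by BWT-based aligners possible.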

What about the variant callers? Most people use GATK, SAMtools mpileup, and freebayes for variant calling. I have gone through their websites, but I can't find which algorithms they use to detect variants, except that freebayes states that it uses a Bayesian approach. I want to know whether all variant callers are based on a Bayesian algorithm, or whether they find variants in different ways using different algorithms.
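As a rough illustration of what "Bayesian" means for variant calling, the sketch below computes posterior probabilities for the three diploid genotypes at a single biallelic site from read counts. It is a toy model, not the implementation used by freebayes, GATK, or SAMtools: the fixed `error_rate`, the flat default priors, and the function name `genotype_posteriors` are all assumptions made for the example.

```python
import math


def genotype_posteriors(n_ref, n_alt, error_rate=0.01, priors=None):
    """Posterior P(genotype | reads) for RR, RA, AA at one biallelic site.

    Toy model: every read independently shows the alt allele with a
    probability that depends only on the genotype and a fixed error rate.
    """
    if priors is None:
        # Flat prior (an assumption; real callers use informed priors).
        priors = {"RR": 1 / 3, "RA": 1 / 3, "AA": 1 / 3}
    # Probability that a single read shows the alt allele under each genotype.
    p_alt = {"RR": error_rate, "RA": 0.5, "AA": 1.0 - error_rate}

    log_joint = {}
    for genotype, p in p_alt.items():
        # Binomial log-likelihood; the binomial coefficient is the same for
        # all genotypes, so it cancels in the normalisation and is omitted.
        log_lik = n_alt * math.log(p) + n_ref * math.log(1.0 - p)
        log_joint[genotype] = math.log(priors[genotype]) + log_lik

    # Bayes' rule: normalise prior * likelihood (shifted for numerical stability).
    shift = max(log_joint.values())
    unnorm = {g: math.exp(v - shift) for g, v in log_joint.items()}
    total = sum(unnorm.values())
    return {g: u / total for g, u in unnorm.items()}


print(genotype_posteriors(n_ref=10, n_alt=10))
```

With roughly equal ref and alt counts the heterozygous genotype dominates; with only ref reads the homozygous-reference genotype does. Changing `priors` shows how prior knowledge shifts the call when coverage is low.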

variant caller ngs • 3.7k views
modified 3.8 years ago • written 3.8 years ago by manojkumarbioinfo30

Most of the code for GATK is available on github.

written 3.8 years ago by mastal5112.0k

Sorry but how is this answer relevant to the question? I'm moving it to a comment.

written 3.8 years ago by RamRS25k
3.8 years ago by
India
manojkumarbioinfo30 wrote:

I found that these are the major differences between GATK and Samtools:

  1. Alignment preprocessing: GATK drops reads with low mapping quality, but samtools uses all reads by default.
  2. SNP genotype likelihood model: GATK assumes sequencing errors are independent, while samtools assumes a second error occurs with higher probability.
  3. Indel genotype likelihood model: this is one of the major differences between samtools and GATK. The samtools model is derived from BAQ; GATK's model is derived from Dindel's model.
  4. Filtering: samtools uses hand-tuned filters, while GATK learns filters from data. GATK's approach is more convenient and powerful, at least for human variant calling, where you have enough data to train the model and do not need non-polymorphic sites.
  5. Other GATK-specific features: haplotype caller and indel realignment, among many others.
  6. Other samtools-specific features: genotype-free analysis, physical phasing, and so on.
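The "independent errors" assumption in the SNP genotype likelihood point above can be sketched as a product over reads, which becomes a sum in log space. This is a textbook-style diploid model written for illustration, not code from GATK or samtools; the function names, the equal split of an error among the three other bases, and the `(base, phred_quality)` pileup format are assumptions.

```python
import math


def phred_to_error(qual):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-qual / 10)


def log10_genotype_likelihood(pileup, genotype):
    """log10 P(reads | genotype) assuming independent sequencing errors.

    pileup:   list of (base, phred_quality) tuples observed at one site
    genotype: tuple of two alleles, e.g. ("A", "G")
    """
    total = 0.0
    for base, qual in pileup:
        e = phred_to_error(qual)
        # P(observed base | true allele): correct with 1 - e, otherwise the
        # error is split evenly over the three other bases (an assumption).
        per_allele = [(1 - e) if base == allele else e / 3 for allele in genotype]
        # Each read is drawn from either chromosome with probability 1/2.
        total += math.log10(0.5 * per_allele[0] + 0.5 * per_allele[1])
    # Independence means per-read terms simply add in log space.
    return total


pileup = [("A", 30)] * 5 + [("G", 30)] * 5
for gt in [("A", "A"), ("A", "G"), ("G", "G")]:
    print(gt, round(log10_genotype_likelihood(pileup, gt), 2))
```

A model with dependent errors would no longer factor into one term per read, so repeated mismatches would not each reduce the reference-homozygous likelihood by the full independent amount.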
written 3.8 years ago by manojkumarbioinfo30

Nice summary! I forgot to point you to the thread "Difference Between Samtools And Gatk Algorithms", sorry :) Regarding indels, please keep in mind that GATK can easily be fooled by a lack of coverage and by short deletions. This is clearly visible when you call variants in closely spaced exon regions.

modified 3.8 years ago • written 3.8 years ago by H.Hasani810

Nice find. I remember reading it long ago, but I'd forgotten it existed :)

written 3.8 years ago by RamRS25k

Good job - this is very informative. Thank you!

written 3.8 years ago by RamRS25k
3.8 years ago by
H.Hasani810
Freiburg, Germany
H.Hasani810 wrote:

Hi,

I use GATK and Samtools, so I can recommend the following papers. I've never used freebayes, however:

Samtools: "A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data", PMID: 21903627

GATK: "The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data.", PMID: 20644199

Hth

written 3.8 years ago by H.Hasani810