Question: Deciding on a Variant Caller
yusefbadi10 wrote:

How should I determine which variant caller is best for a cancer mutation dataset? Average tumor purity is about 70%, so it's not great.

MuSE performs very well on a similar dataset, but that one has 90%+ purity, so I'm not sure how it will perform on this data. It seems MuSE outperforms MuTect2 generally, but I'm still unsure...

It seems that tumor purity confounds the results, so I'm leaning towards VarScan: it sidesteps this because it doesn't use a probabilistic framework (like Bayesian statistics) to detect variants and assess confidence in them. However, it struggles with sensitivity and fails to pick up somatic SNVs of low allelic fraction, so that's a major problem.

I would really appreciate some advice on what to look at when deciding what to use...

modified 5 weeks ago by d-cameron570 • written 5 weeks ago by yusefbadi10

"MuSE outperforms MuTect2"

Do you have a reference for this statement? (I'm genuinely interested; I don't mean to argue for or against it.)

written 5 weeks ago by dariober7.5k

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1029-6

This is a good comparison. Maybe I'm overgeneralising, but in the context I'm working with this seems to be the case.

written 5 weeks ago by yusefbadi10

Of course, that is a paper from the MuSE developers. Every variant caller that gets published claims to be better than all the previous ones.

written 5 weeks ago by igor3.8k

It's pretty easy to outperform 5 other callers when you get to select the data set, the truth set, and the callers to compare against.

written 5 weeks ago by d-cameron570

In fact I was hoping for a reference other than the authors' paper...

written 5 weeks ago by dariober7.5k

Good point. I will get back to you if I find anything worthwhile.

written 5 weeks ago by yusefbadi10
Chris Miller18k (Washington University in St. Louis, MO) wrote:

Any modern caller should be "good enough" for most high-VAF calls in relatively pure samples (70% counts as relatively pure, in my book). When you need low-VAF data or are worried about tricky regions, my preferred approach is to run several callers, merge the calls, then do some post-filtering.

See Figure 4 and the supplement of our paper for a comparison on one very deeply sequenced tumor: http://www.cell.com/cell-systems/abstract/S2405-4712(15)00113-1

It does not include newer callers like MuSE or MuTect2, but it does show that different callers have different strengths and weaknesses.
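A minimal sketch of the "run several callers, merge the calls, then post-filter" idea. The caller names and variants are invented for illustration, and requiring support from at least two callers is only one possible post-filter, not necessarily the one used in the paper above. Real pipelines would read VCF files and normalise variant representation first.

```python
from collections import defaultdict


def merge_calls(calls_by_caller, min_callers=2):
    """Merge calls and keep variants reported by at least `min_callers` callers.

    calls_by_caller maps a caller name to an iterable of
    (chrom, pos, ref, alt) tuples. Returns a dict mapping each
    retained variant to the sorted list of callers that reported it.
    """
    support = defaultdict(set)
    for caller, variants in calls_by_caller.items():
        for variant in variants:
            support[variant].add(caller)
    return {
        variant: sorted(callers)
        for variant, callers in support.items()
        if len(callers) >= min_callers
    }


# Toy data, hypothetical caller names and sites.
calls = {
    "caller_a": [("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")],
    "caller_b": [("chr1", 100, "A", "T"), ("chr3", 300, "C", "G")],
    "caller_c": [("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")],
}
consensus = merge_calls(calls, min_callers=2)
```

Setting `min_callers=1` gives the plain union, which keeps sensitivity high and leaves all the work to later filters; raising it trades sensitivity for specificity.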

written 5 weeks ago by Chris Miller18k

Thanks, this is excellent advice and really useful. One question: what would you consider high VAF?

written 5 weeks ago by yusefbadi10
d-cameron570 (Australia) wrote:

Have a look at the results of the DREAM Somatic Mutation Calling Challenge [1]. There are a number of somatic-only callers that perform well on their benchmarks.

[1] http://dreamchallenges.org/project/icgc-tcga-dream-somatic-mutation-calling-challenge/

written 5 weeks ago by d-cameron570
igor3.8k (United States) wrote:

Different variant callers will perform differently for different samples. You have to find one that works best for yours. This means calling variants with different callers and checking which ones can be successfully validated.
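As a rough sketch of how such a comparison could be scored, assume a set of sites has been checked by an orthogonal method and split into confirmed and rejected variants. The function and argument names here are hypothetical, and the sensitivity is only relative to the sites that were actually validated.

```python
def validation_summary(called, validated_true, validated_false):
    """Score one caller's calls against orthogonally validated sites."""
    called = set(called)
    validated_true = set(validated_true)
    validated_false = set(validated_false)

    tp = len(called & validated_true)   # called and confirmed
    fp = len(called & validated_false)  # called but rejected
    fn = len(validated_true - called)   # confirmed but missed

    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return {
        "tp": tp,
        "fp": fp,
        "fn": fn,
        "precision": precision,
        "sensitivity": sensitivity,
    }


# Toy identifiers standing in for variants.
summary = validation_summary(
    called={"v1", "v2", "v3"},
    validated_true={"v1", "v2", "v4"},
    validated_false={"v3"},
)
```

Running this once per caller on the same validated sites gives directly comparable numbers for your own samples.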

It sounds like you are basing your assessment on tumor purity alone. It is a complicated measurement and it's often difficult to estimate the tumor fraction accurately. How confident are you in that 70% estimate? Additionally, tumors are heterogeneous, so there is never actually a pure tumor. Tumor fraction aside, there are a lot of other factors, such as the quality of input DNA and sequencing depth, that will have a huge impact on your results.
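To make the interplay between purity, heterogeneity and depth concrete, here is a small sketch of the standard expected-VAF arithmetic. It assumes a simple mixture of tumor and normal cells with known copy numbers and ignores sequencing error and mapping bias; the 60x depth and the 20% subclone are illustrative values, not figures from this thread.

```python
def expected_vaf(purity, ccf=1.0, mutant_copies=1, tumor_cn=2, normal_cn=2):
    """Expected variant allele fraction of a somatic SNV.

    purity        fraction of cells in the sample that are tumor cells
    ccf           fraction of tumor cells carrying the mutation
    mutant_copies mutated copies per mutated tumor cell
    tumor_cn      total copy number at the locus in tumor cells
    normal_cn     total copy number at the locus in normal cells
    """
    mutant_alleles = purity * ccf * mutant_copies
    total_alleles = purity * tumor_cn + (1.0 - purity) * normal_cn
    return mutant_alleles / total_alleles


def expected_alt_reads(vaf, depth):
    """Expected number of reads supporting the variant at a given depth."""
    return vaf * depth


# Clonal heterozygous SNV in a diploid region at 70% purity: about 0.35.
clonal = expected_vaf(0.70)
# Same SNV present in only 20% of tumor cells: about 0.07.
subclonal = expected_vaf(0.70, ccf=0.20)

# At a hypothetical 60x depth: about 21 and about 4 supporting reads.
clonal_reads = expected_alt_reads(clonal, 60)
subclonal_reads = expected_alt_reads(subclonal, 60)
```

Under these assumptions, going from 90% to 70% purity only moves a clonal heterozygous variant from about 0.45 to about 0.35, whereas subclonal variants quickly drop to a handful of supporting reads, which is where depth and caller sensitivity matter most.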

modified 5 weeks ago • written 5 weeks ago by igor3.8k