Question: Deciding on a Variant Caller
6 months ago by
yusefbadi20
yusefbadi20 wrote:

How should I determine which variant caller is best for a cancer mutation dataset? I'm working with about 70% average tumor purity, so it's not great.

MuSE performs very well on a similar dataset, but at 90%+ purity, so I'm not sure how it will perform on this data. It seems MuSE generally outperforms MuTect2, but I'm still unsure...

It seems that tumor purity confounds the results, so I'm leaning towards VarScan: it sidesteps this because it doesn't use a probabilistic framework (such as Bayesian statistics) to detect variants and assess confidence in them. However, it struggles with sensitivity and fails to pick up somatic SNVs of low allelic fraction, so that's a major problem.

I would really appreciate some advice on what to look at when deciding what to use...

ADD COMMENTlink modified 6 months ago by d-cameron850 • written 6 months ago by yusefbadi20

MuSE outperforms MuTect2

Do you have a reference for this statement? (I'm genuinely interested; I don't mean to argue for or against it.)

ADD REPLYlink written 6 months ago by dariober8.0k

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1029-6

This is a good comparison. Maybe I'm overgeneralising, but in the context I'm working with this seems to be the case.

ADD REPLYlink written 6 months ago by yusefbadi20

Of course, that is a paper from the MuSE developers. Every variant caller that gets published claims to be better than all the previous ones.

ADD REPLYlink written 6 months ago by igor4.5k

It's pretty easy to outperform 5 other callers when you get to select the data set, the truth set, and the callers to compare against.

ADD REPLYlink written 6 months ago by d-cameron850

In fact I was hoping for a reference other than the authors' paper...

ADD REPLYlink written 6 months ago by dariober8.0k

Good point, I will get back to you if I find anything worthwhile.

ADD REPLYlink written 6 months ago by yusefbadi20
6 months ago by
Chris Miller18k
Washington University in St. Louis, MO
Chris Miller18k wrote:

Any modern caller should be "good enough" for most high-VAF calls in relatively pure samples (70% counts as relatively pure, in my book). When you need low-VAF data or are worried about tricky regions, my preferred approach is to run several callers, merge the calls, then do some post-filtering.

See Figure 4 and the supplement of our paper here for a comparison on one very deeply sequenced tumor: http://www.cell.com/cell-systems/abstract/S2405-4712(15)00113-1 It does not include newer callers like MuSE or MuTect2, but it does show that different callers have different strengths and weaknesses.
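A minimal sketch of the "run several callers, merge the calls, then post-filter" idea described above. Everything specific here is an assumption for illustration: the caller names, the toy variants, the `(depth, alt_reads)` evidence table, and the thresholds `min_depth` and `min_alt_reads` are hypothetical, and a real pipeline would read these from VCF files rather than hard-coded sets.

```python
from collections import defaultdict


def merge_calls(calls_by_caller):
    """Union the call sets and record which callers reported each variant."""
    support = defaultdict(set)
    for caller, variants in calls_by_caller.items():
        for variant in variants:
            support[variant].add(caller)
    return {variant: sorted(callers) for variant, callers in support.items()}


def post_filter(merged, evidence, min_depth=20, min_alt_reads=3):
    """Keep merged variants whose read evidence passes simple thresholds.

    The thresholds are illustrative assumptions, not recommended values.
    """
    kept = {}
    for variant, callers in merged.items():
        depth, alt_reads = evidence[variant]
        if depth >= min_depth and alt_reads >= min_alt_reads:
            kept[variant] = {"callers": callers, "vaf": alt_reads / depth}
    return kept


# Hypothetical toy input: variants keyed by (chrom, pos, ref, alt).
calls = {
    "caller_a": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
    "caller_b": {("chr1", 100, "A", "T"), ("chr3", 300, "C", "T")},
}
# Hypothetical tumor read evidence per variant: (total depth, alt-supporting reads).
evidence = {
    ("chr1", 100, "A", "T"): (80, 24),
    ("chr2", 200, "G", "C"): (60, 2),
    ("chr3", 300, "C", "T"): (15, 5),
}

merged = merge_calls(calls)
kept = post_filter(merged, evidence)
for variant, info in sorted(kept.items()):
    print(variant, info)
```

Keeping the list of supporting callers alongside each variant makes it easy to apply stricter filters later to calls made by only one tool.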

ADD COMMENTlink written 6 months ago by Chris Miller18k

Thanks, this is excellent advice and really useful. One question: what would you consider high-VAF?

ADD REPLYlink written 6 months ago by yusefbadi20
6 months ago by
d-cameron850
Australia
d-cameron850 wrote:

Have a look at the results of the DREAM Somatic Mutation Calling Challenge [1]. There are a number of somatic-only callers that perform well on their benchmarks.

[1] http://dreamchallenges.org/project/icgc-tcga-dream-somatic-mutation-calling-challenge/

ADD COMMENTlink written 6 months ago by d-cameron850
6 months ago by
igor4.5k
United States
igor4.5k wrote:

Different variant callers will perform differently for different samples. You have to find one that works best for yours. This means calling variants with different callers and checking which ones can be successfully validated.
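As a rough sketch of what "checking which ones can be successfully validated" might look like in practice, the snippet below scores each caller against a set of sites that were sent for validation. The caller names and the variant IDs `v1` to `v6` are hypothetical placeholders, and the function `score_caller` is an illustrative helper, not part of any existing tool.

```python
def score_caller(called, validated_true, validated_false):
    """Precision and recall for one caller, restricted to validated sites."""
    tp = len(called & validated_true)    # called and confirmed
    fp = len(called & validated_false)   # called but failed validation
    fn = len(validated_true - called)    # confirmed but missed by this caller
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall


# Hypothetical validation outcome for six candidate sites.
validated_true = {"v1", "v2", "v3", "v4"}
validated_false = {"v5", "v6"}

# Hypothetical call sets from two callers.
calls = {
    "caller_a": {"v1", "v2", "v3", "v5"},
    "caller_b": {"v1", "v2"},
}

scores = {
    name: score_caller(called, validated_true, validated_false)
    for name, called in calls.items()
}
for name, (precision, recall) in scores.items():
    print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```

Note that recall here is only relative to the sites that were validated. If the validation candidates were drawn from the callers' own output, variants missed by every caller never enter the count.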

It sounds like you are basing your assessment on tumor purity alone. Purity is a complicated measurement, and it's often difficult to estimate the tumor fraction accurately. How confident are you in that 70% estimate? Additionally, tumors are heterogeneous, so there is never actually a pure tumor. Tumor fraction aside, there are a lot of other factors, such as the quality of input DNA and sequencing depth, that will have a huge impact on your results.
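To make the interplay between tumor fraction and sequencing depth concrete, here is a back-of-the-envelope sketch. It assumes a deliberately simple model that is not taken from this thread: normal cells are diploid, the purity estimate is exact, reads sample alleles binomially, and there is no sequencing error or mapping bias. The function names and the three-read detection threshold are illustrative assumptions.

```python
from math import comb


def expected_vaf(purity, tumor_copies=2, mutated_copies=1, cancer_cell_fraction=1.0):
    """Expected variant allele fraction under a simple mixture model.

    Mutant alleles come only from tumor cells; the denominator counts all
    alleles at the locus from tumor cells and from diploid normal cells.
    """
    mutant = purity * cancer_cell_fraction * mutated_copies
    total = purity * tumor_copies + (1.0 - purity) * 2
    return mutant / total


def prob_detect(depth, vaf, min_alt_reads=3):
    """P(at least min_alt_reads variant reads) under binomial sampling."""
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer


# Clonal heterozygous SNV, copy-neutral, 70% purity: VAF is about 0.35.
clonal = expected_vaf(0.7)
# Same purity, but the mutation is present in only 10% of cancer cells.
subclonal = expected_vaf(0.7, cancer_cell_fraction=0.1)

for depth in (30, 100, 300):
    print(
        f"depth={depth}: clonal={prob_detect(depth, clonal):.3f} "
        f"subclonal={prob_detect(depth, subclonal):.3f}"
    )
```

Under these assumptions the clonal variant has plenty of supporting reads even at modest depth, while the subclonal one depends heavily on depth, which is one reason purity alone says little about how a caller will perform.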

ADD COMMENTlink modified 6 months ago • written 6 months ago by igor4.5k