Question: Calling somatic variants with Varscan
ww22runner wrote (6 months ago):

Hello everyone,

I am trying to use VarScan to find somatic variants, and I want to apply filters on the minimum variant frequency and the minimum number of variant-supporting reads. I have tumor-normal paired samples.

In this case, should I use: a) mpileup2snp and mpileup2indel separately and indicate the filters or b) somatic and then somaticFilter on each one of the files produced (snps and indels)?

Is there a difference in how these two approaches work? Also, I want the output in VCF format, and according to the manual only mpileup2snp/mpileup2indel offer that option. Is that right?

Thanks!


VCF format is available as output if the right option is given. I used the second of the workflows you mentioned, and it worked pretty well.

Reply written 6 months ago by gab20

Thank you for your input.

Reply written 6 months ago by ww22runner
ATpoint wrote (6 months ago):

It is the second workflow that you want. I have a script in my Github that uses VarScan2 for somatic calling (not saying it is a nice one or fulfills any good-practice standard when it comes to style or whatever), but you might get some inspiration from it. It also uses GNU parallel to parallelize the process over all chromosomes.

It starts by calling the raw variants with VarScan2 somatic, then separates them into somatic and germline with processSomatic, selects high-confidence variants based on VarScan's Fisher's exact test, and finally runs the heuristic fpfilter to remove junk calls.
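A minimal sketch of how the first steps above could be chained, with the two filters from the question applied via somaticFilter. This is not the script mentioned above. The jar path, file names, and default thresholds are hypothetical, and the option names (`--min-var-freq`, `--output-vcf`, `--min-reads2`, `--indel-file`, `--output-file`) are written as I recall them from the VarScan manual, so check them against `java -jar VarScan.jar somatic -h` and `somaticFilter -h` for your version. Whether somaticFilter accepts VCF input may also depend on the version. The fpfilter step is left out because it needs read counts from a separate tool. The code only builds and prints the commands; it does not run them.

```python
from typing import List

VARSCAN_JAR = "VarScan.jar"  # hypothetical path to the VarScan2 jar


def build_somatic_commands(normal_pileup: str, tumor_pileup: str, prefix: str,
                           min_var_freq: float = 0.10,
                           min_var_reads: int = 4) -> List[List[str]]:
    """Return argument lists for somatic -> processSomatic -> somaticFilter."""
    base = ["java", "-jar", VARSCAN_JAR]
    # Assumed output names when --output-vcf 1 is given; verify for your version.
    snp_vcf = f"{prefix}.snp.vcf"
    indel_vcf = f"{prefix}.indel.vcf"
    return [
        # 1) Raw tumor-normal calls for SNVs and indels, written as VCF.
        base + ["somatic", normal_pileup, tumor_pileup, prefix,
                "--min-var-freq", str(min_var_freq),
                "--output-vcf", "1"],
        # 2) Split the calls by somatic status (germline, LOH, somatic).
        base + ["processSomatic", snp_vcf],
        base + ["processSomatic", indel_vcf],
        # 3) Apply the variant-read and variant-frequency thresholds to SNVs.
        base + ["somaticFilter", snp_vcf,
                "--min-reads2", str(min_var_reads),
                "--min-var-freq", str(min_var_freq),
                "--indel-file", indel_vcf,
                "--output-file", f"{prefix}.snp.filtered.vcf"],
    ]


if __name__ == "__main__":
    # Print the commands so they can be inspected before running anything.
    for cmd in build_somatic_commands("normal.pileup", "tumor.pileup", "sample1"):
        print(" ".join(cmd))
```

Printing first makes it easy to paste one command into a shell and confirm the options are accepted before wiring the list into `subprocess.run` or GNU parallel.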


Hello ATpoint,

Thank you for your input. I have also been doing something along the same lines (somatic > processSomatic) to look at somatic variants, but I was wondering how different the results would be if I took approach 1 instead. Would the results be mostly similar? I am trying to switch over from approach 1 to approach 2 and want to find out how I can fairly assess whether the switch is worth it. Any advice is appreciated!
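One simple way to put numbers on "mostly similar" is to compare the two call sets directly. The sketch below is only an illustration under stated assumptions: both approaches have produced VCF files for the same sample, variants are matched on chromosome, position, REF and ALT, and the function names are made up for this example. It does not say which approach is more correct, only how much they overlap.

```python
from typing import Dict, Iterable, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


def variant_keys(vcf_lines: Iterable[str]) -> Set[Variant]:
    """Collect one key per ALT allele from the data lines of a VCF."""
    keys: Set[Variant] = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):
            keys.add((chrom, int(pos), ref, allele))
    return keys


def compare_call_sets(a: Set[Variant], b: Set[Variant]) -> Dict[str, float]:
    """Summarise the overlap between two sets of variant keys."""
    shared = a & b
    union = a | b
    return {
        "shared": len(shared),
        "only_a": len(a - b),
        "only_b": len(b - a),
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


if __name__ == "__main__":
    # Tiny made-up records standing in for the two approaches' outputs.
    approach1 = ["#CHROM\tPOS\tID\tREF\tALT",
                 "chr1\t100\t.\tA\tG",
                 "chr1\t200\t.\tC\tT"]
    approach2 = ["#CHROM\tPOS\tID\tREF\tALT",
                 "chr1\t100\t.\tA\tG",
                 "chr2\t50\t.\tG\tA,C"]
    print(compare_call_sets(variant_keys(approach1), variant_keys(approach2)))
```

The variants found by only one approach are the interesting ones to inspect by hand, for example in a genome browser, before deciding whether to switch.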

Thank you!

Reply written 5 months ago by ww22runner

I do not actually know the details of VarScan2 (and I switched to Strelka2 because it is still maintained), but given that a somatic mode exists, I would use it. Everything else requires custom code, which in turn needs evaluation to test whether the variants you get from that custom procedure are actually reliable. Pulling out the somatic variants requires some kind of test to check whether the allele frequency/allelic count is significantly higher or lower than in the germline, which is pretty much what is implemented in processSomatic via (I think) a Fisher's exact test. So why make the effort if somatic callers exist? Still, if you are starting a new project, switching to a more recent tool is probably a good idea.
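To illustrate the kind of test meant here, this is a minimal sketch of a one-sided Fisher's exact test on a 2x2 table of reference and variant read counts in tumor versus normal. It is not VarScan's code, and the sidedness and thresholds VarScan uses internally are not confirmed here; the function name and the example counts are made up.

```python
from math import comb


def fisher_one_sided(tumor_ref: int, tumor_var: int,
                     normal_ref: int, normal_var: int) -> float:
    """P-value for seeing at least `tumor_var` variant reads in the tumor.

    Under the null hypothesis, variant reads are spread between tumor and
    normal in proportion to depth (hypergeometric distribution).
    """
    tumor_total = tumor_ref + tumor_var
    var_total = tumor_var + normal_var
    ref_total = tumor_ref + normal_ref
    # Number of ways to choose which reads belong to the tumor sample.
    denom = comb(var_total + ref_total, tumor_total)
    # Sum the probabilities of tables at least as extreme as the observed one.
    k_max = min(var_total, tumor_total)
    tail = sum(comb(var_total, k) * comb(ref_total, tumor_total - k)
               for k in range(tumor_var, k_max + 1))
    return tail / denom


if __name__ == "__main__":
    # Variant reads seen only in the tumor: small p-value expected.
    print(fisher_one_sided(tumor_ref=30, tumor_var=10, normal_ref=40, normal_var=0))
    # Same variant fraction in both samples: large p-value expected.
    print(fisher_one_sided(tumor_ref=35, tumor_var=5, normal_ref=35, normal_var=5))
```

A variant with similar support in both samples looks germline, while one enriched in the tumor gets a small p-value. Writing and validating this logic yourself on top of mpileup2snp output is exactly the custom work that a dedicated somatic mode saves you.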

Reply written 5 months ago by ATpoint

That makes sense, thank you for your advice.

Reply written 5 months ago by ww22runner