Question: Calling somatic variants with VarScan
ww22runner wrote (16 months ago):

Hello everyone,

I am trying to use VarScan to find somatic variants, but I want to apply filters on the minimum variant frequency and the minimum number of variant reads. I have tumor-normal paired samples.

In this case, should I use: a) mpileup2snp and mpileup2indel separately and indicate the filters or b) somatic and then somaticFilter on each one of the files produced (snps and indels)?

Is there a difference in how these two approaches work? Also, I want the output in VCF format, and according to the manual, only mpileup2snp/mpileup2indel offers that option?

Thanks!


VCF format is available as output if the right option is given. I used the second of the workflows you mentioned, and it worked pretty well.

written 16 months ago by gab20

Thank you for your input.

written 16 months ago by ww22runner
ATpoint (Germany) wrote (16 months ago):

It is the second workflow that you want. I have a script on my GitHub that uses VarScan2 for somatic calling (not saying it is a nice one or fulfills any good-practice standard when it comes to style or whatever), but you might get some inspiration from it. It also uses GNU parallel to parallelize the process over all chromosomes.

It starts by calling the raw variants with VarScan2 somatic, then separates them into somatic and germline with processSomatic, selects high-confidence variants based on VarScan's Fisher's exact test, and finally runs the heuristic fpfilter to remove junk calls.
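As a rough illustration of how the first steps of such a workflow could be chained, here is a minimal Python sketch that only builds the command lines (it does not run them, and it is not the script mentioned above). The jar path, sample names, thresholds, and output file names are assumptions for illustration; in particular, the `.snp.vcf`, `.indel.vcf`, and `.snp.Somatic.hc.vcf` names should be checked against what your VarScan version actually writes. The fpfilter step is left out because it additionally needs a read-count file. The `somaticFilter` step is where the minimum variant read count (`--min-reads2`) and minimum variant frequency (`--min-var-freq`) from the original question would go.

```python
import shlex

VARSCAN_JAR = "VarScan.jar"  # hypothetical path to the VarScan2 jar


def varscan_cmd(subcommand, *args, **options):
    """Build one VarScan2 command line as a list of tokens."""
    cmd = ["java", "-jar", VARSCAN_JAR, subcommand, *args]
    for name, value in options.items():
        # The keyword min_var_freq becomes the flag --min-var-freq.
        cmd += ["--" + name.replace("_", "-"), str(value)]
    return cmd


def somatic_pipeline(normal_pileup, tumor_pileup, prefix,
                     min_var_freq=0.1, min_reads2=4):
    """Command lines for somatic -> processSomatic -> somaticFilter (SNVs)."""
    snp_vcf = prefix + ".snp.vcf"            # assumed output name
    indel_vcf = prefix + ".indel.vcf"        # assumed output name
    hc_vcf = prefix + ".snp.Somatic.hc.vcf"  # assumed output name
    return [
        # 1) raw tumor-normal calls, written as VCF
        varscan_cmd("somatic", normal_pileup, tumor_pileup, prefix,
                    min_var_freq=min_var_freq, output_vcf=1),
        # 2) split calls by somatic status and confidence
        varscan_cmd("processSomatic", snp_vcf),
        # 3) apply read-count and frequency thresholds to the somatic calls
        varscan_cmd("somaticFilter", hc_vcf,
                    min_reads2=min_reads2, min_var_freq=min_var_freq,
                    indel_file=indel_vcf,
                    output_file=prefix + ".snp.Somatic.hc.filter.vcf"),
    ]


if __name__ == "__main__":
    for cmd in somatic_pipeline("normal.pileup", "tumor.pileup", "pair1"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to inspect them before handing them to a shell, `subprocess.run`, or GNU parallel.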


Hello ATpoint,

Thank you for your input. I have also been doing something along the same lines (somatic > processSomatic) to look at somatic variants, but I was wondering how different the results would be if I took approach 1 instead. Would the results be mostly similar? I am trying to switch over from approach 1 to 2 and wanted to find out how I can fairly assess whether it is worth switching. Any advice is appreciated!

Thank you!

written 16 months ago by ww22runner
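One simple way to start comparing the two approaches is to measure how much their call sets overlap. The sketch below is only an illustration under stated assumptions: both approaches are assumed to produce VCF files, variants are matched on chromosome, position, reference allele, and alternate allele, and the function names are made up for this example. Overlap alone does not say which call set is more accurate; that would need a truth set or orthogonal validation.

```python
def variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from VCF text lines."""
    keys = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        for alt in alts.split(","):  # one key per alternate allele
            keys.add((chrom, int(pos), ref, alt))
    return keys


def compare_callsets(keys_a, keys_b):
    """Summarize the overlap between two sets of variant keys."""
    shared = keys_a & keys_b
    union = keys_a | keys_b
    return {
        "shared": len(shared),
        "only_a": len(keys_a - keys_b),
        "only_b": len(keys_b - keys_a),
        # Jaccard index: shared calls divided by all distinct calls
        "jaccard": len(shared) / len(union) if union else 1.0,
    }
```

Because `variant_keys` accepts any iterable of lines, an open file handle for each approach's VCF can be passed in directly. Calls unique to one approach are the ones worth inspecting by hand, for example in a genome browser.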

I do not actually know the details of VarScan2 (and I switched to Strelka2 because it is still maintained), but given that a somatic mode exists, I would use it. Anything else will require custom code, which in turn needs evaluation to test whether the variants you get from the custom procedure are actually reliable. Pulling out the somatic variants would require some kind of test to check whether the allele frequency or allelic count is significantly higher or lower than in the germline, which is pretty much what is implemented in processSomatic via (I think) a Fisher's exact test. So why make the effort if somatic callers exist? Still, if you are starting a new project, switching to a more recent tool is probably a good idea.

written 16 months ago by ATpoint
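To make the kind of test described above concrete, here is a minimal sketch of a one-sided Fisher's exact test on reference and variant read counts from a normal and a tumor sample. It is a generic textbook implementation written for illustration, not VarScan's own code, and the function name and the choice of a one-sided test (variant reads enriched in the tumor) are assumptions.

```python
from math import comb


def fisher_tumor_enrichment(normal_ref, normal_var, tumor_ref, tumor_var):
    """One-sided Fisher's exact test p-value for variant reads enriched in tumor.

    Treats the 2x2 table of read counts as hypergeometric: given the total
    number of variant reads and the tumor depth, how likely is it to see at
    least `tumor_var` variant reads in the tumor by chance?
    """
    total = normal_ref + normal_var + tumor_ref + tumor_var
    total_var = normal_var + tumor_var
    tumor_depth = tumor_ref + tumor_var
    denom = comb(total, tumor_depth)
    p_value = 0.0
    # Sum the probabilities of all tables at least as extreme as the observed one.
    for k in range(tumor_var, min(total_var, tumor_depth) + 1):
        p_value += comb(total_var, k) * comb(total - total_var, tumor_depth - k) / denom
    return p_value


if __name__ == "__main__":
    # Variant reads only in the tumor: small p-value, consistent with somatic.
    print(fisher_tumor_enrichment(20, 0, 15, 5))
    # Same variant fraction in both samples: large p-value, consistent with germline.
    print(fisher_tumor_enrichment(10, 10, 10, 10))
```

A hand-rolled filter built on mpileup2snp output would need something like this plus its own thresholds and validation, which is the extra effort the somatic mode saves.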

That makes sense, thank you for your advice.

written 16 months ago by ww22runner