Calling somatic variants with VarScan
3.2 years ago
ww22runner ▴ 20

Hello everyone,

I am trying to use VarScan to find somatic variants, but I want to apply filters on the minimum variant frequency and the minimum number of variant reads. I have tumor-normal paired samples.

In this case, should I use: a) mpileup2snp and mpileup2indel separately and indicate the filters or b) somatic and then somaticFilter on each one of the files produced (snps and indels)?

Is there a difference in how these two approaches work? Also, I want the output in VCF format, and according to the manual, only mpileup2snp/mpileup2indel offer that option?

Thanks!

varscan • 1.1k views

VCF format is available as output if the right option is given (--output-vcf 1 for somatic). I used the second of the workflows you mentioned, and it worked pretty well.
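For reference, the second workflow with VCF output and the two thresholds from the question could look roughly like the sketch below. It only builds the command lines and does not run them. The jar path, file names, and threshold values are placeholders, and the output names `<prefix>.snp.vcf` / `<prefix>.indel.vcf` are an assumption. The options `--output-vcf`, `--min-var-freq`, `--min-reads2`, `--indel-file`, and `--output-file` are given as I recall them from the VarScan2 manual, so please check them against the help output of `somatic` and `somaticFilter` for your version, including whether your somaticFilter accepts VCF input.

```python
def somatic_commands(normal_pileup, tumor_pileup, prefix,
                     min_var_freq=0.05, min_var_reads=4,
                     jar="VarScan.jar"):
    """Build (not run) VarScan2 command lines for somatic + somaticFilter."""
    base = ["java", "-jar", jar]

    # Step 1: paired tumor-normal calling; --output-vcf 1 asks for VCF output.
    somatic = base + ["somatic", normal_pileup, tumor_pileup, prefix,
                      "--min-var-freq", str(min_var_freq),
                      "--output-vcf", "1"]

    # Assumed output names of step 1 (verify for your VarScan version).
    snp_file = f"{prefix}.snp.vcf"
    indel_file = f"{prefix}.indel.vcf"

    # Step 2: filter the SNP calls on variant read count and variant frequency,
    # using the indel calls to drop SNPs near indels.
    snp_filter = base + ["somaticFilter", snp_file,
                         "--min-reads2", str(min_var_reads),
                         "--min-var-freq", str(min_var_freq),
                         "--indel-file", indel_file,
                         "--output-file", f"{prefix}.snp.filtered.vcf"]
    return [somatic, snp_filter]


# Print the commands for inspection; hand them to subprocess.run() to execute.
for cmd in somatic_commands("normal.pileup", "tumor.pileup", "sample1"):
    print(" ".join(cmd))
```

The same somaticFilter call can be repeated on the indel file with the thresholds you want for indels.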


Thank you for your input.

3.1 years ago
ATpoint 55k

It is the second workflow that you want. I have a script on my GitHub that uses VarScan2 for somatic calling (not saying it is a nice one or follows any good-practice standard when it comes to style), but you might get some inspiration from it. It also uses GNU parallel to parallelize the process over all chromosomes.

It starts by calling the raw variants with VarScan2 somatic, then separates them into somatic and germline with processSomatic, selects high-confidence variants based on VarScan's Fisher's exact test, and finally runs the heuristic fpfilter to remove junk calls.
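The per-chromosome fan-out can also be sketched without GNU parallel. The snippet below is not the script mentioned above; it is a minimal Python illustration of the same idea using a thread pool. The BAM and reference file names, the `chr`-prefixed chromosome names, and the worker function are all hypothetical, and the worker only builds a `samtools mpileup` command restricted to one region instead of running it.

```python
from concurrent.futures import ThreadPoolExecutor


def pileup_command(chrom, normal_bam="normal.bam", tumor_bam="tumor.bam",
                   ref="ref.fa"):
    """Build (not run) a samtools mpileup command for a single chromosome."""
    return ["samtools", "mpileup", "-f", ref, "-r", chrom,
            normal_bam, tumor_bam]


def run_chromosome(chrom):
    # Placeholder worker: a real one would execute the pileup, feed it to
    # VarScan somatic, and then run the downstream filtering steps.
    return chrom, " ".join(pileup_command(chrom))


# Assumed chromosome naming; adjust to match the reference in use.
chromosomes = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]

# Each chromosome is an independent job, so they can be dispatched in parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(run_chromosome, chromosomes))

print(results["chr1"])
```

Threads are enough here because the real work would happen in external processes; the per-chromosome outputs then need to be concatenated at the end.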


Hello ATpoint,

Thank you for your input. I have also been doing something along the same lines (somatic > processSomatic) to look at somatic variants, but I was wondering how different the results would be if I took approach 1 instead. Would the results be mostly similar? I am trying to switch over from approach 1 to 2 and wanted to find out how I can fairly assess whether it is worth switching. Any advice is appreciated!

Thank you!


I do not actually know the details of VarScan2 (and I switched to Strelka2 because it is still maintained), but given that a somatic mode exists, I would use it. Everything else will require custom code, which in turn needs to be evaluated to check whether the variants you get from the custom procedure are actually reliable. Pulling out the somatic variants requires some kind of test to check whether the allele frequency or allelic count in the tumor is significantly higher or lower than in the germline, which is pretty much what is implemented in processSomatic via (I think) a Fisher's exact test. So why make the effort if somatic callers exist? Still, if you are starting a new project, switching to a more recent tool is probably a good idea.
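To make the idea of such a test concrete, here is a minimal sketch of a one-sided Fisher's exact test on a 2x2 table of reference and variant read counts in normal versus tumor. This is not VarScan's code, and VarScan's exact procedure and thresholds may differ; the function name and the read counts are made up for illustration.

```python
from math import comb


def fisher_one_sided(normal_ref, normal_var, tumor_ref, tumor_var):
    """P-value for seeing at least `tumor_var` variant reads in the tumor,
    given the table margins (hypergeometric null of equal allele frequency)."""
    total_var = normal_var + tumor_var
    tumor_depth = tumor_ref + tumor_var
    total = normal_ref + normal_var + tumor_depth
    denom = comb(total, tumor_depth)

    p = 0.0
    # Sum the probabilities of all tables at least as extreme as the observed one.
    for k in range(tumor_var, min(total_var, tumor_depth) + 1):
        p += comb(total_var, k) * comb(total - total_var, tumor_depth - k) / denom
    return p


# Variant absent in normal, clearly present in tumor: very small p-value.
print(fisher_one_sided(normal_ref=50, normal_var=0, tumor_ref=30, tumor_var=20))

# Same allele frequency in both samples (germline-like): large p-value.
print(fisher_one_sided(normal_ref=40, normal_var=10, tumor_ref=40, tumor_var=10))
```

A hand-rolled filter on variant frequency and read count alone does not include a comparison like this against the matched normal, which is the main argument for using the dedicated somatic mode.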


That makes sense, thank you for your advice.
