Question: Best practice for serial variant calling
Asked 7 weeks ago by graeme.thorn (London, United Kingdom):

I'm wondering what the best procedure would be for calling variants on serial plasma samples from the same patient with a single normal sample.

For instance, running the samtools-mpileup-varscan2 pipeline with the normal sample first and the serial samples after gives genotype calls of 0/0 when I'd expect a variant to be called, such as here:

chr1    1471992 .   T   C   .   PASS    ADP=14;WT=2;HET=2;HOM=0;NC=0    GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR    0/1:23:23:23:16:7:30.43%:4.5803E-3:50:38:11:5:2:5   0/1:21:13:13:7:6:46.15%:7.4534E-3:34:35:2:5:1:5 0/0:3:10:10:6:4:40%:4.3344E-2:35:30:2:4:0:4 0/0:6:13:13:9:4:30.77%:4.7826E-2:30:30:5:4:1:3

in the fourth column (3rd serial plasma sample), even though its read statistics are very similar to those in the second (1st serial sample), where the genotype has been called as 0/1.
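To compare the per-sample read statistics side by side, the record above can be split on its FORMAT keys. This is only a sketch: the sample labels are hypothetical and assume the normal was listed first, followed by the serial plasma samples in order.

```python
# Record copied from the question; fields are whitespace-separated here.
record = (
    "chr1 1471992 . T C . PASS ADP=14;WT=2;HET=2;HOM=0;NC=0 "
    "GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR "
    "0/1:23:23:23:16:7:30.43%:4.5803E-3:50:38:11:5:2:5 "
    "0/1:21:13:13:7:6:46.15%:7.4534E-3:34:35:2:5:1:5 "
    "0/0:3:10:10:6:4:40%:4.3344E-2:35:30:2:4:0:4 "
    "0/0:6:13:13:9:4:30.77%:4.7826E-2:30:30:5:4:1:3"
)


def sample_fields(line):
    """Return one dict per sample, keyed by the FORMAT column names."""
    cols = line.split()
    keys = cols[8].split(":")  # FORMAT is the 9th VCF column
    return [dict(zip(keys, sample.split(":"))) for sample in cols[9:]]


samples = sample_fields(record)

# Hypothetical labels; assumes the normal sample was given first.
labels = ["normal", "plasma1", "plasma2", "plasma3"]

for label, s in zip(labels, samples):
    print(label, s["GT"], s["DP"], s["RD"], s["AD"], s["FREQ"], s["PVAL"])
```

On this record the two samples called 0/1 have PVAL below 0.01, while the two called 0/0 have PVAL around 0.04. Whether that difference is what drives the genotype call is an assumption worth checking against the VarScan settings used.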

Is this the best way of calling variants on multiple samples, or is it better to do normal/p1, normal/p2, normal/p3 etc, and then merge the variant sets at the end?


Not sure about "best practice", but I generally run all the variant calling per-sample or per-pair in parallel, then convert to .tsv with GATK VariantsToTable, add sample labels to each .tsv, and concatenate the .tsv files into a single table for review. If you have tumor-normal pairs, be sure to use variant callers that support them; I use MuTect2 and LoFreq Somatic right now, but there are plenty of others. If you are asking about the technical aspects of running them in parallel, you would want either something basic like GNU parallel or a workflow manager like Snakemake or Nextflow.
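The label-and-concatenate step could look roughly like the sketch below. It assumes each per-sample .tsv has a header row and that all tables share the same columns; the function name and the `SAMPLE` column name are made up for illustration.

```python
import csv


def concat_labelled_tsvs(tables, out_path, label_column="SAMPLE"):
    """Concatenate per-sample TSVs, prefixing each row with its sample label.

    tables: dict mapping sample label -> path of a TSV with a header row.
    All tables are assumed to share the same header; a mismatch raises.
    """
    header = None
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        for label, path in tables.items():
            with open(path, newline="") as handle:
                reader = csv.reader(handle, delimiter="\t")
                this_header = next(reader)
                if header is None:
                    # Write the shared header once, with the new label column.
                    header = this_header
                    writer.writerow([label_column] + header)
                elif this_header != header:
                    raise ValueError(f"{path}: header differs from first table")
                for row in reader:
                    writer.writerow([label] + row)
```

Calling it with something like `concat_labelled_tsvs({"P1": "N_vs_P1.tsv", "P2": "N_vs_P2.tsv"}, "all_calls.tsv")` (hypothetical file names) gives one table that can be filtered or sorted by sample.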

modified 7 weeks ago • written 7 weeks ago by steve2.2k

My question was more about whether to run everything through varscan (or equivalent) at once, which leads to calls I think are incorrect like the one above, or to run the (single) normal against each serial plasma sample in pairs (N v P1, N v P2, N v P3, N v P4, etc.) and then join the variants together as you suggest.

written 7 weeks ago by graeme.thorn

If your plasma samples were collected independently, then I think you would definitely want to run the variant calling independently for each tumor-normal pair. They would be considered separate biological samples.
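As a rough sketch of what "one run per pair" means in practice, the single normal can be paired with each serial sample to produce one job each. The sample names and file names here are hypothetical, and the command template is an assumption that should be checked against the documentation of whichever caller is used.

```python
def pair_with_normal(normal, plasma_samples):
    """Return one (normal, plasma, output_prefix) job per serial sample."""
    return [(normal, p, f"{normal}_vs_{p}") for p in plasma_samples]


# Hypothetical sample names: one normal, four serial plasma samples.
jobs = pair_with_normal("N", ["P1", "P2", "P3", "P4"])

# Assumed command template (normal pileup, tumor pileup, output basename);
# verify the argument order for your caller before running anything.
template = "java -jar VarScan.jar somatic {normal}.pileup {plasma}.pileup {prefix}"

commands = [
    template.format(normal=n, plasma=p, prefix=prefix) for n, p, prefix in jobs
]

# The commands are only printed here, not executed.
for cmd in commands:
    print(cmd)
```

Each printed command is independent of the others, so the list can be handed to GNU parallel or turned into rules in a workflow manager.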

written 7 weeks ago by steve2.2k
