Losing somatic SNPs
4.7 years ago
gab ▴ 20

I got circa 130 high-confidence somatic SNPs by comparing tumor and blood samples with VarScan. I wanted to annotate them, so I passed the data to VarAft, but it returns only 6 variants. What am I doing wrong? Is there another tool that gives me the same information I can get with VarAft?

SNP varscan varaft
Have you thought about annotating this file with a tool other than VarAft? If that works for all your variants, the problem is with VarAft. If it works only for the same 6 variants, the problem is in your input file.

It may be helpful to give us a sample of your somatic variants here.

chrom  position   ref  var  normal_reads1  normal_reads2  normal_var_freq  normal_gt  tumor_reads1  tumor_reads2  tumor_var_freq  tumor_gt  somatic_status  variant_p_value  somatic_p_value        tumor_reads1_plus  tumor_reads1_minus  tumor_reads2_plus  tumor_reads2_minus  normal_reads1_plus  normal_reads1_minus  normal_reads2_plus  normal_reads2_minus
1      1268181    T    C    36             0              0%               T          9             2             18,18%          Y         Somatic         1.0              0.050878815911191914   9                  0                   2                  0                   35                  1                    0                   0
1      17084510   G    A    148            7              4,52%            G          67            13            16,25%          R         Somatic         1.0              0.0031524490371238824  67                 0                   13                 0                   140                 8                    7                   0
1      26887584   C    A    19             0              0%               C          8             2             20%             M         Somatic         1.0              0.11083743842364542    8                  0                   2                  0                   19                  0                    0                   0
1      27873916   G    T    45             0              0%               G          9             2             18,18%          K         Somatic         1.0              0.035714285714286274   9                  0                   2                  0                   41                  4                    0                   0
1      45504720   C    G    52             3              5,45%            C          35            14            28,57%          S         Somatic         1.0              0.001473746031299      34                 1                   14                 0                   50                  2                    3                   0
1      57221553   T    G    12             0              0%               T          3             9             75%             G         Somatic         1.0              1.682595234890286E-4   3                  0                   9                  0                   11                  1                    0                   0
1      57221554   G    A    13             0              0%               G          3             9             75%             A         Somatic         1.0              1.076860950329785E-4   3                  0                   9                  0                   12                  1                    0                   0
1      92467622   C    T    8              0              0%               C          13            2             13,33%          Y         Somatic         1.0              0.4150197628458471     13                 0                   2                  0                   8                   0                    0                   0


These are some example lines of .snp output from VarScan. Annotating with VEP gives no problems.

Are these coordinates on GRCh37 or GRCh38? If VEP runs fine with your input file, chances are that something is going on with VarAft.

Can you compare the 6 positions that work with VarAft with the ones that do not? What is different between these two groups?
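One way to make that comparison concrete is to split the VarScan calls into those that survived annotation and those that were dropped. This is only a minimal sketch: it assumes both tables are tab-separated and carry `chrom` and `position` columns as in the sample above, and the layout of the annotated export is a guess, not a documented format. The function names are made up for illustration.

```python
import csv

def read_positions(handle, chrom_col="chrom", pos_col="position"):
    """Return the set of (chrom, position) pairs found in a tab-separated table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {(row[chrom_col], int(row[pos_col])) for row in reader}

def split_by_annotation(all_handle, annotated_handle):
    """Split the original calls into (kept, lost) lists, each sorted by position.

    all_handle:       open handle on the VarScan .snp table
    annotated_handle: open handle on the annotated export
                      (assumed to have the same two column names)
    """
    all_sites = read_positions(all_handle)
    kept_sites = read_positions(annotated_handle)
    return sorted(all_sites & kept_sites), sorted(all_sites - kept_sites)
```

Printing the two lists side by side should show quickly whether the lost variants share a chromosome, a coordinate range, or an allele pattern.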

It turned out the BAM files were generated with a different FASTA from the one I used to produce the mpileups. This led to indexing problems, and all SNPs were misplaced. Thanks to everybody for the help!
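For anyone hitting the same issue, a quick sanity check is to compare the contigs in the BAM header with those in the FASTA index. The sketch below works on plain text: the header as printed by `samtools view -H` and the contents of the `.fai` file written by `samtools faidx`. The function names are hypothetical. Matching names and lengths is a necessary check only; it does not prove that the sequences are identical.

```python
def contigs_from_sam_header(header_text):
    """Map contig name -> length from the @SQ lines of a SAM/BAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields look like "SN:1" or "LN:249250621", separated by tabs.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        contigs[fields["SN"]] = int(fields["LN"])
    return contigs

def contigs_from_fai(fai_text):
    """Map contig name -> length from a .fai index (first two columns)."""
    contigs = {}
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]
            contigs[name] = int(length)
    return contigs

def reference_mismatches(bam_contigs, fasta_contigs):
    """List (name, bam_length, fasta_length) for every contig that disagrees.

    A length of None means the contig is missing on that side.
    """
    problems = []
    for name in sorted(set(bam_contigs) | set(fasta_contigs)):
        bam_len = bam_contigs.get(name)
        fasta_len = fasta_contigs.get(name)
        if bam_len != fasta_len:
            problems.append((name, bam_len, fasta_len))
    return problems
```

An empty list means the names and lengths agree. Any entry suggests the BAM was aligned against a different reference than the one used for the pileup.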

Does the tool remove variants that are unlikely to be related to disease? If so, I would not be surprised that the majority of mutations were eliminated. Most somatic mutations are passengers with no causal role in the disease.