Dear all,
I have a question about why VarScan2 somatic and mpileup2snp give different results for the same tumor sample. I used VarScan2 somatic to call somatic and germline variants from a matched tumor-normal pair with the commands below:
samtools mpileup -d 1000000 -q 30 -Q 30 -A -B -l Bed -f hg19Ref Tumor.Bam > tumor.mpileup
samtools mpileup -d 1000000 -q 1 -Q 15 -A -B -l Bed -f hg19Ref Normal.Bam > normal.mpileup
java -jar VarScan.v2.4.3.jar somatic normal.mpileup tumor.mpileup tumor --output-snp Paired.snv.vcf --min-coverage-normal 8 --min-coverage-tumor 100 --min-var-freq 0.01 --min-freq-for-hom 0.75 --normal-purity 1.00 --tumor-purity 1.00 --p-value 0.99 --somatic-p-value 0.01 --strand-filter 1 --validation 0 --output-vcf 1
Then I filtered the somatic and germline variants using the following thresholds (--min-coverage-tumor 100 --min-var-freq 0.01 --min-reads2 4):
grep -w "SS=1" Paired.snv.vcf | perl germline.filter.pl - > Paired.snv.germline.vcf
grep -w "SS=2" Paired.snv.vcf | perl somatic.filter.pl - > Paired.snv.somatic.vcf
cat Paired.snv.somatic.vcf Paired.snv.germline.vcf > Tumor.snv.paired.vcf
I also used VarScan2 mpileup2snp to call variants from the same tumor sample with the command below:
java -jar VarScan.v2.4.3.jar mpileup2snp tumor.mpileup --min-coverage 100 --min-var-freq 0.01 --min-avg-qual 15 --strand-filter 1 --output-vcf 1 --min-reads2 4 > Tumor.snv.single.vcf
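To see exactly which calls differ, the two files can be compared by site. The following is only a minimal sketch, not part of the pipeline above: the function names vcf_keys and only_in_first are made up for illustration, and it assumes that matching on CHROM, POS, REF and ALT is a good enough definition of "the same variant".

```shell
# Print a CHROM:POS:REF:ALT key for every non-header record of a VCF, sorted.
# (vcf_keys is a hypothetical helper name, not a VarScan or samtools command.)
vcf_keys() {
    awk '!/^#/ && NF >= 5 { print $1 ":" $2 ":" $4 ":" $5 }' "$1" | LC_ALL=C sort -u
}

# Print the keys that occur in the first VCF but not in the second.
only_in_first() {
    a=$(mktemp); b=$(mktemp)
    vcf_keys "$1" > "$a"
    vcf_keys "$2" > "$b"
    LC_ALL=C comm -23 "$a" "$b"   # lines unique to the first sorted list
    rm -f "$a" "$b"
}

# Example call with the file names used in this post:
# only_in_first Tumor.snv.paired.vcf Tumor.snv.single.vcf
```

Swapping the two arguments lists the calls made only by mpileup2snp.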
I expected Tumor.snv.single.vcf to contain more variants than Tumor.snv.paired.vcf, but I see the opposite. Some variants that satisfy all of the thresholds above are found only in Tumor.snv.paired.vcf and not in Tumor.snv.single.vcf. For example:
1 120611960 . C T . PASS DP=348;SS=1;SSC=31;GPV=1E0;SPV=7.46E-4 GT:GQ:DP:RD:AD:FREQ:DP4 0/1:.:82:72:10:12.2%:44,28,6,4 0/1:.:266:260:6:2.26%:156,104,4,2
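As a sanity check on the record above, the tumor values can be compared against the three thresholds directly. This is a rough sketch under two assumptions: that the second sample column (column 11) is the tumor, with the normal in column 10, and that DP, AD and FREQ are the fields corresponding to --min-coverage, --min-reads2 and --min-var-freq. The function name check_tumor is hypothetical. It does not reproduce VarScan's other tests, such as the p-value or the strand filter.

```shell
# Check one VCF record against the coverage, supporting-read and frequency
# thresholds used in this post. check_tumor is a hypothetical helper name.
check_tumor() {
    printf '%s\n' "$1" | awk -v min_cov=100 -v min_freq=0.01 -v min_reads2=4 '{
        n = split($9, key, ":")     # FORMAT keys, e.g. GT:GQ:DP:RD:AD:FREQ:DP4
        split($11, val, ":")        # assumed tumor column (normal assumed in column 10)
        for (i = 1; i <= n; i++) f[key[i]] = val[i]
        freq = f["FREQ"]
        sub(/%$/, "", freq)         # "2.26%" becomes "2.26"
        freq = freq / 100           # convert percent to a fraction
        ok = (f["DP"] + 0 >= min_cov + 0 && f["AD"] + 0 >= min_reads2 + 0 && freq >= min_freq + 0)
        printf "DP=%d AD=%d FREQ=%.4f %s\n", f["DP"], f["AD"], freq, (ok ? "passes" : "fails")
    }'
}

# Example call with the record quoted in this post:
# check_tumor "$(grep -w 120611960 Tumor.snv.paired.vcf)"
```

For the record above, the tumor column has DP=266, AD=6 and FREQ=2.26%, so it passes all three of these thresholds.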
I would expect VarScan2 mpileup2snp to report this variant, but it does not. Does anyone have an idea why?
Thanks so much
Rui