Question: (Closed) VarScan somatic mutation calling
jiagehao10 (European Union) wrote, 5.8 years ago:

Dear all,

I used STAR for mapping and VarScan2 to call somatic mutations in 77 pairs of lung cancer and normal tissue samples, using the command lines below.

STAR_2.3.0e.Linux_x86_64/STAR  --genomeDir genomedir/  --runThreadN 4 --readFilesIn /home/jli/err/ERR164578_1.fastq /home/jli/err/ERR164578_2.fastq --outFileNamePrefix /home/jli/Lung_cancer/ERR164578. --outSAMstrandField intronMotif

samtools mpileup -f genomedir/Homo_sapiens_assembly19.fasta Lung_cancer/ERR164578.bam > Lung_cancer/ERR164578.pileup

Generating the mpileup files was very slow: it took almost 10 hours to produce the pileup file for one sample. What's worse, I then used the following command to detect somatic variants from a pair of normal and cancer samples:

java -jar VarScan.v2.3.7.jar somatic Lung_cancer/ERR164493.pileup Lung_cancer/ERR164578.pileup Lung_cancer/ERR164578_VarScan.snp --output-vcf 1

VarScan took more than 10 hours to call variants and reported a huge number of germline SNPs, for example:

chr15    77154793    .    N    C    .    PASS    DP=62;SS=1;SSC=0;GPV=6.4572E-34;SPV=1E0    GT:GQ:DP:RD:AD:FREQ:DP4    1/1:.:37:0:36:100%:0,0,16,20    1/1:.:25:0:21:100%:0,0,11,10
chr15    77154794    .    N    T    .    PASS    DP=62;SS=1;SSC=0;GPV=2.5602E-33;SPV=1E0    GT:GQ:DP:RD:AD:FREQ:DP4    1/1:.:37:0:36:100%:0,0,16,20    1/1:.:25:0:20:100%:0,0,10,10
chr15    77154795    .    N    A    .    PASS    DP=62;SS=1;SSC=0;GPV=2.5602E-33;SPV=1E0    GT:GQ:DP:RD:AD:FREQ:DP4    1/1:.:37:0:36:100%:0,0,16,20    1/1:.:25:0:20:100%:0,0,11,9
chr15    77154796    .    N    T    .    PASS    DP=62;SS=1;SSC=0;GPV=4.0229E-32;SPV=1E0    GT:GQ:DP:RD:AD:FREQ:DP4    1/1:.:37:0:36:100%:0,0,16,20    1/1:.:25:0:18:100%:0,0,10,8
chr15    77154797    .    N    T    .    PASS    DP=63;SS=1;SSC=0;GPV=6.4572E-34;SPV=1E0    GT:GQ:DP:RD:AD:FREQ:DP4    1/1:.:37:0:36:100%:0,0,16,20    1/1:.:26:0:21:100%:0,0,10,11
chr15    77154798    .    N    A    .    PASS    DP=61;SS=1;SSC=0;GPV=1.5943E-31;SPV=1E0    GT:GQ:DP:RD:AD:FREQ:DP4    1/1:.:37:0:37:100%:0,0,16,21    1/1:.:24:0:16:100%:0,0,8,8
chr15    77154799    .    N    G    .    PASS    DP=60;SS=1;SSC=0;GPV=4.0229E-32;SPV=1E0    GT:GQ:DP:RD:AD:FREQ:DP4    1/1:.:37:0:37:100%:0,0,16,21    1/1:.:23:0:17:100%:0,0,8,9

As you can see, the reference allele of every variant is N. I did the same using BAM files generated by TopHat mapping: the whole process was much faster and the number of SNPs called was more reasonable. Has anyone had the same problem before? Any suggestions will be appreciated.

 


Hello jiagehao!

Questions similar to yours can already be found at:

We have closed your question to allow us to keep similar content in the same thread.

If you disagree with this please tell us why in a reply below. We'll be happy to talk about it.

Cheers!

PS: Very unlikely to be related to VarScan. Very likely to be an issue with how you generated your pileups.
Written 5.8 years ago by Daniel Swan (13k)
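
One quick way to check whether the pileup itself is at fault, before running VarScan at all, is to count how many pileup lines carry N as the reference base (the third column of the pileup format). The following is only a minimal sketch: the function name `pileup_n_fraction` is hypothetical and not part of samtools or VarScan, and it assumes a tab-separated pileup as written by `samtools mpileup`.

```shell
# Hypothetical helper (not part of samtools or VarScan).
# Prints: <lines with N as reference base> <total lines> <fraction>
# Pileup columns: chromosome, position, reference base, depth, read bases, qualities.
# Reads the file given as $1, or standard input if no file is given.
pileup_n_fraction() {
    awk -F'\t' '
        { total++ }                      # count every pileup line
        toupper($3) == "N" { n++ }       # count lines whose reference base is N or n
        END {
            if (total == 0) { print "0 0 0.0000"; exit }
            printf "%d %d %.4f\n", n, total, n / total
        }
    ' "${1:--}"
}
```

For a large file it is enough to look at the beginning, for example `head -n 100000 Lung_cancer/ERR164578.pileup | pileup_n_fraction`. A fraction close to 1 suggests that mpileup could not look up the reference bases for the contigs in the BAM file.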

Thank you so much. I read the post "Mpileup File From Samtools", and it solved my problem: the pileup came out correctly once I used a uniform human genome reference to generate the pileup file.

Written 5.8 years ago by jiagehao10
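
A reference mismatch of this kind can be checked before running mpileup by comparing the contig names in the BAM header with those in the FASTA file. The sketch below assumes the mismatch is in contig naming (for example `chr15` in the alignments versus `15` in the reference), which is one plausible cause but is not confirmed in the thread. The function name `missing_contigs` is hypothetical, and the header file is assumed to have been written beforehand with `samtools view -H sample.bam > header.sam`.

```shell
# Hypothetical helper: list contigs named in a SAM/BAM header that are absent from a FASTA file.
# $1: reference FASTA file
# $2: plain-text SAM header, e.g. from: samtools view -H sample.bam > header.sam
missing_contigs() {
    fasta_names=$(mktemp)
    bam_names=$(mktemp)
    # A FASTA contig name is the text after ">" up to the first whitespace.
    grep '^>' "$1" | sed 's/^>//' | awk '{ print $1 }' | sort -u > "$fasta_names"
    # @SQ header lines carry the contig name in their SN: field.
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
    }' "$2" | sort -u > "$bam_names"
    # comm -23 keeps lines found only in the first file: BAM contigs missing from the FASTA.
    comm -23 "$bam_names" "$fasta_names"
    rm -f "$fasta_names" "$bam_names"
}
```

Called as `missing_contigs genomedir/Homo_sapiens_assembly19.fasta header.sam`, it prints nothing when every contig in the BAM header also exists in the FASTA. Any names it does print are contigs for which mpileup has no reference sequence to read from.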