Question: How to get "normal.pileup" file for varscan somatic calling
15 months ago, aadam35411 wrote:

In order to perform somatic mutation calling with varscan, I need to run this line:

java -jar VarScan.jar somatic normal.pileup tumor.pileup output.basename

I was wondering if there is a place to get either a pileup version of hg19 or even a bam file of it in order to put it in the spot for "normal.pileup" as required by the command above. I'd rather download it because going through the process of creating such a file from the fastq file I have will take forever.

15 months ago, ATpoint wrote:

I think a few things need to be clarified. Somatic variants arise in a particular tissue at some point during life; they are not inherited from the parents via the germline (oocyte or sperm). As a consequence, somatic variant calling in, e.g., a tumor biopsy requires a matched normal control, most commonly a sequencing run from peripheral blood. In any case, it must come from the same patient. I point this out because an arbitrary hg19 "normal" BAM will not help: the control must be a sequencing experiment from the same individual as your tumor sample. What kind of data do you have?

The command I use with the latest VarScan2 version is:

samtools mpileup -q 20 -Q 25 -B -d 1000 -f genome.fasta normal.bam tumor.bam | $varscan2 somatic /dev/stdin $outputname --mpileup 1 --strand-filter 1 --output-vcf 1

This creates a combined normal/tumor pileup and pipes it straight into VarScan, since I don't think there is really a reason to save these pileups to disk.
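If you do want the three-argument form from the question (separate normal.pileup and tumor.pileup), something along these lines should work. This is only a minimal sketch: the file names (normal.bam, tumor.bam, genome.fasta, output.basename) and the location of VarScan.jar are placeholders, and both BAMs are assumed to come from the same patient and to be aligned to the same reference. By default it only prints the commands so they can be inspected first.

```shell
#!/usr/bin/env bash
# Sketch: build per-sample pileups for the three-argument form of `VarScan somatic`.
# All file names are placeholders (assumptions), not files from this thread.
set -euo pipefail

REF="genome.fasta"       # reference the BAMs were aligned to (placeholder)
DRY_RUN="${DRY_RUN:-1}"  # 1 = only print the commands, 0 = also execute them

# Print a command line, and execute it unless DRY_RUN=1.
run() {
    echo "$*"
    if [ "$DRY_RUN" != "1" ]; then
        eval "$*"
    fi
}

# Write the pileup of a single BAM to a file, using the same
# mpileup filters as the piped command above.
make_pileup() {
    local bam="$1" out="$2"
    run "samtools mpileup -q 20 -Q 25 -B -d 1000 -f $REF $bam > $out"
}

make_pileup normal.bam normal.pileup
make_pileup tumor.bam tumor.pileup
run "java -jar VarScan.jar somatic normal.pileup tumor.pileup output.basename"
```

Running it with DRY_RUN=0 executes the three commands. Keep in mind that uncompressed pileups of a whole genome can get very large, which is the main reason to prefer the piped version.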


Thank you for the explanation, that clears up a lot. Right now I'm learning the pipeline that goes from the original Illumina sequencing reads (fastq) to a VCF file. I have the two original read files and a copy of hg19, and all my other files have been derived from these three. I'll have to check in with my PI about getting the control sequencing file.

Reply written 15 months ago by aadam35411

Is this a whole-genome sequencing experiment, and what computational resources do you have available (CPUs, RAM)?

Reply written 15 months ago by ATpoint