Question

How to get "normal.pileup" file for varscan somatic calling

5.6 years ago

aadam35411 • 0

To perform somatic mutation calling with VarScan, I need to run this command:

java -jar VarScan.jar somatic normal.pileup tumor.pileup output.basename

Is there a place to download either a pileup version of hg19, or a BAM file of it, that I could use as the "normal.pileup" argument in the command above? I'd rather download it, because creating such a file from the FASTQ file I have would take forever.

varscan • 1.6k views

score 0 · Answer 1 · 2018-09-02

5.6 years ago

ATpoint 82k

I think a few things need to be clarified. Somatic variants are variants that arise in a certain tissue at a certain time point and are not inherited from the parents via the germline (oocyte or sperm). As a consequence, somatic variant calling in, e.g., a tumor biopsy requires a normal control, most commonly a sequencing run from peripheral blood. In any case, it must come from the same patient. I point this out because an arbitrary hg19 "normal" BAM will not help: the control must be a sequencing experiment from the same individual as your tumor sample. What kind of data do you have?

The command I use with the latest VarScan2 version is:

samtools mpileup -q 20 -Q 25 -B -d 1000 -f genome.fasta normal.bam tumor.bam | $varscan2 somatic /dev/stdin $outputname --mpileup 1 --strand-filter 1 --output-vcf 1

This creates a combined normal/tumor mpileup and pipes it straight into VarScan. I don't think there is really a reason to save these pileups to disk.
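For anyone who wants to adapt the piped command, here is a minimal sketch of it wrapped in a small bash script. The file names, the output basename, and the VarScan jar path are placeholders (assumptions, not values from this thread), and the VarScan options are written in the `--mpileup 1` / `--output-vcf 1` form given in the VarScan manual. The normal BAM has to be listed before the tumor BAM, and both must come from the same patient.

```shell
#!/usr/bin/env bash
# Sketch only: every path below is a placeholder, adjust to your own data.
set -euo pipefail

genome="genome.fasta"          # reference the BAMs were aligned to (e.g. hg19)
normal_bam="normal.bam"        # matched normal from the SAME patient
tumor_bam="tumor.bam"          # tumor sample
outputname="somatic_calls"     # hypothetical output basename
varscan_jar="VarScan.jar"      # hypothetical path to the VarScan2 jar

call_somatic() {
  # One combined mpileup (normal first, tumor second) streamed into VarScan,
  # so no pileup file is ever written to disk.
  samtools mpileup -q 20 -Q 25 -B -d 1000 -f "$genome" "$normal_bam" "$tumor_bam" |
    java -jar "$varscan_jar" somatic /dev/stdin "$outputname" \
      --mpileup 1 --strand-filter 1 --output-vcf 1
}

# Run only if the tools and input files are actually present.
if command -v samtools >/dev/null 2>&1 && command -v java >/dev/null 2>&1 \
   && [ -f "$genome" ] && [ -f "$normal_bam" ] && [ -f "$tumor_bam" ] \
   && [ -f "$varscan_jar" ]; then
  call_somatic
else
  echo "samtools, java, or one of the input files is missing; nothing was run" >&2
fi
```

The guard at the end is only there so the script fails softly when something is missing; with everything in place it runs the same pipeline as the one-liner.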

Thank you for the explanation, that clears up a lot. Right now I'm learning the pipeline that goes from the original Illumina sequencing reads (FASTQ) to a VCF file. I have the two original read files and a copy of hg19, and all my other files are derived from these three. I'll have to check with my PI about getting the control sequencing file.

ADD REPLY • link 5.6 years ago by aadam35411 • 0

Is this a whole-genome sequencing experiment and what computational equipment do you have around (CPUs, RAM)?

ADD REPLY • link 5.6 years ago by ATpoint 82k