Question

VarScan and Samtools - how to process my data?

2

8.9 years ago

ciemanek ▴ 140

Hello, in my project I am going to compare bioinformatic tools for tumor clonal evolution analysis (sciClone and others). I have paired-end whole-exome sequencing results for three samples from one patient: Control, Primary tumor, and Relapse tumor. First I have to detect somatic mutations, and I am going to do that with VarScan. However, I don't quite understand which processing pipeline I should choose, starting from the step of creating .mpileup files with Samtools.

I've found some possibilities, which include:

- creating .mpileup files separately for the normal and tumor samples, then passing both of them as input to VarScan (in short):

samtools mpileup -f hg19.fa normal.bam > normal.bam.mpileup
samtools mpileup -f hg19.fa tumor.bam > tumor.bam.mpileup
java -jar VarScan.jar somatic normal.bam.mpileup tumor.bam.mpileup --output-snp snp --output-indel indel

(source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3971343/)

- creating a single .mpileup file for the paired normal-tumor samples, then passing the result as input to VarScan:

samtools mpileup -f reference.fasta normal.bam tumor.bam > normal-tumor.mpileup
java -jar VarScan.jar somatic normal-tumor.mpileup output.basename

(source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4278659/)

My problem is that I don't quite understand which way I should choose. Could someone please explain to me what the difference between these schemes is? I am a complete beginner in NGS data analysis, so the simpler the words, the better :) And just to be sure: can I treat my "Control" sample as the normal sample?

I would appreciate any help.

Kind regards, Agata

ngs sequencing varscan samtools pipeline • 7.1k views

score 2 · Answer 1 · 2016-07-29

8.9 years ago

poisonAlien ★ 3.2k

Both of them are fine, but there is no need to write the pileup to an output file; it's a waste of time and disk space. You can pipe the mpileup output straight into VarScan. Use -mpileup with the somatic command.

Here:

> samtools mpileup -B -f ref.fa -q 15 -L 10000 -d 10000 normal.bam tumor.bam | java -Xmx16g -d64 -jar varscan.jar somatic -mpileup sampleName --min-coverage-normal 10 --min-coverage-tumor 14 --min-var-freq 0.02 --strand-filter 1
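For the three samples in the question, the same pipe would presumably be run twice, once per tumor sample, each time against the Control BAM. Below is a minimal sketch that only prints the two commands so they can be checked before running. The file names (`hg19.fa`, `control.bam`, `primary.bam`, `relapse.bam`, `varscan.jar`) and the helper `somatic_cmd` are placeholders, not real paths; the samtools and VarScan options are copied from the command above, with the JVM flags left out for brevity.

```shell
#!/usr/bin/env bash
# Sketch: print one samtools-to-VarScan command per tumor sample, each paired
# with the same control BAM. All file names are hypothetical placeholders.
set -euo pipefail

REF="hg19.fa"          # assumed reference FASTA
NORMAL="control.bam"   # assumed Control (normal) BAM
JAR="varscan.jar"      # assumed path to the VarScan jar

# Build the pipeline text for one tumor BAM and one output sample name.
# Option values are taken from the answer's command, not tuned here.
somatic_cmd() {
  local tumor_bam="$1" sample="$2"
  printf 'samtools mpileup -B -f %s -q 15 -L 10000 -d 10000 %s %s | java -jar %s somatic -mpileup %s --min-coverage-normal 10 --min-coverage-tumor 14 --min-var-freq 0.02 --strand-filter 1\n' \
    "$REF" "$NORMAL" "$tumor_bam" "$JAR" "$sample"
}

# The normal BAM is listed first and the tumor BAM second, as in the answer.
somatic_cmd primary.bam primary
somatic_cmd relapse.bam relapse
```

Once the printed commands look right, the script's output can be piped into `bash` to run them.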

1

So does the term "-mpileup" instruct it to take the output of the mpileup before the pipe and use it at that specific place in the varscan command? I'm confused because I'm reading in other places that you can just use a single "-" when piping with SAMtools and it will know to take it as the output of the previous command. Would both "-" and "-mpileup" work in this case?
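For reference, the single `-` convention mentioned here can be seen with a standard Unix tool. This is only an illustration of that general convention using `sort`; it does not show how VarScan parses its own arguments, which is a separate matter.

```shell
# "-" stands for standard input, so sort reads the piped lines as if they
# came from a file named on the command line.
sorted="$(printf 'tumor\nnormal\n' | sort -)"
echo "$sorted"
```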