VarScan2: run in parallel / merge mpileup output from different runs into a single mpileup
9.9 years ago
Chirag Nepal ★ 2.4k

Hi all,

I am using varscan2 to identify SNPs in our data (exome study comparing normal/tumor from 5 patients).

For testing I used only two pairs,

samtools mpileup -f assembly.fa ST1_normal.bam ST1_tumor.bam ST2_normal.bam ST2_tumor.bam > myTestData.mpileup

It takes a long time. Is there a way to run it in parallel? Or, if I run it on a single pair (tumor/normal) at a time, can I simply concatenate all the mpileup files into a single output file that I can use as input to varscan2 for SNP calling?

Thanks for your help !

Cheers
Chirag

exome SNP varcsan • 3.2k views
9.9 years ago

Run one mpileup per chromosome or region, in parallel, using the option -l or -r:

-l FILE     BED or position list file containing a list of regions or sites where pileup or BCF should be generated
-r STR  Only generate pileup in region STR
samtools mpileup -f assembly.fa  -r chr1 (...) >  myTestData1.mpileup
samtools mpileup -f assembly.fa  -r chr2 (...) >  myTestData2.mpileup
samtools mpileup -f assembly.fa  -r chr3 (...) >  myTestData3.mpileup
(...)
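A minimal sketch of starting these per-region runs at the same time from a single script, using shell background jobs. The helper name `pileup_regions` and the output file naming are illustrative assumptions; note that `-r` needs indexed BAM files and that the region names must match those in the reference.

```shell
# Sketch: one samtools mpileup per region, all started as background jobs.
# pileup_regions is a hypothetical helper name; output names are illustrative.
# usage: pileup_regions REF "REGION1 REGION2 ..." BAM [BAM ...]
pileup_regions() {
    ref=$1
    regions=$2
    shift 2
    # $regions is left unquoted on purpose so it splits into one word per region.
    for region in $regions; do
        # The trailing "&" runs this job in the background so regions overlap.
        samtools mpileup -f "$ref" -r "$region" "$@" > "myTestData.${region}.mpileup" &
    done
    # Block until every background job has finished.
    wait
}
```

Example call, with file names taken from the question: `pileup_regions assembly.fa "chr1 chr2 chr3" ST1_normal.bam ST1_tumor.bam`. This starts one job per region with no upper limit, so it suits a handful of regions better than hundreds.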

Dear Pierre,

Thanks for the answer. So, just to be clear, I need to run it separately for each chromosome:

samtools mpileup -f assembly.fa  -r chr1 Nor_1.bam Tum_1.bam Nor_2.bam Tum_2.bam Nor_3.bam Tum_3.bam Nor_N.bam Tum_N.bam > mplie_N1.mpileup
samtools mpileup -f assembly.fa  -r chr2 Nor_1.bam Tum_1.bam Nor_2.bam Tum_2.bam Nor_3.bam Tum_3.bam Nor_N.bam Tum_N.bam > mplie_N2.mpileup

So to make these run in parallel, do I need to submit the jobs separately in different bash scripts? I guess if I put all these commands in a single bash script they will run sequentially, right? Though it might still be faster when the chromosomes are separated.

Next, once we have these multiple mpileup results, can we simply concatenate them into a single file, like

cat mplie_N1.mpileup mplie_N2.mpileup > mplie_total.mpileup

Or does samtools have a function to concatenate them?

thanks !

Cheers
Chirag


So to make these run in parallel, I need to submit the jobs separately in different bash scripts?

Use GNU parallel or GNU make with the option -j; see the post "How To Run Muscle In Batch?".
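As a rough sketch of the same idea without extra tools, `xargs -P` can stand in for GNU parallel or `make -j` to cap how many jobs run at once. The helper names, the regions file and the `region_*.mpileup` naming below are assumptions for illustration, and the BAMs are assumed to be indexed. On concatenation: mpileup text output has no header lines, so per-region files built from the same BAMs in the same order can be joined with plain `cat` in reference order. Files built from different BAM sets (for example one tumor/normal pair per file) have different columns and cannot be merged this way.

```shell
# Sketch only. Helper names, the regions file and the region_*.mpileup naming
# are assumptions for illustration. xargs -P is a GNU/BSD extension, not POSIX.

# usage: pileup_parallel REF REGION_LIST JOBS BAM [BAM ...]
# REGION_LIST holds one region per line, in reference order.
pileup_parallel() {
    ref=$1
    region_list=$2
    jobs=$3
    shift 3
    # xargs substitutes each region for {} and keeps at most $jobs commands running.
    xargs -P "$jobs" -I {} sh -c '
        region=$1; ref=$2; shift 2
        samtools mpileup -f "$ref" -r "$region" "$@" > "region_${region}.mpileup"
    ' sh {} "$ref" "$@" < "$region_list"
}

# usage: merge_pileups REGION_LIST OUTPUT
# Concatenates the per-region files in the order listed, not in completion order.
merge_pileups() {
    region_list=$1
    out=$2
    : > "$out"
    while IFS= read -r region; do
        cat "region_${region}.mpileup" >> "$out"
    done < "$region_list"
}
```

Example, assuming a hypothetical `regions.txt` containing chr1, chr2, ... one per line: `pileup_parallel assembly.fa regions.txt 4 Nor_1.bam Tum_1.bam` followed by `merge_pileups regions.txt mplie_total.mpileup`.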
