GATK 4 and Spark multithreading
4 weeks ago
Victoria • 0

I would like to know how to use Spark within GATK for multi-threaded analysis. Unfortunately, the Broad Institute's documentation for its cluster/Spark tutorial is still in progress. I am using HaplotypeCaller, which has been working fine, but I now have some pooled-seq samples that take much longer, so I would like to spread the workload. This is an example of my usage:

gatk HaplotypeCaller -I my_pooled_sample.bam -I another_pooled_sample.bam -L a_chromosome -R ref_genome.fna -O my_out_file.g.vcf -ploidy 10 -- --spark-master local[2]

I took the Spark arguments in the command above from this example, but it didn't work. I checked the help info and got this:

>     gatk forwards commands to GATK and adds some sugar for submitting spark jobs
>      --spark-runner <target>    controls how spark tools are run
>          valid targets are:
>          LOCAL:      run using the in-memory spark runner
>          SPARK:      run using spark-submit on an existing cluster
>                      --spark-master must be specified
>                      --spark-submit-command may be specified to control the Spark submit command
>                      arguments to spark-submit may optionally be specified after --
>          GCS:        run using Google cloud dataproc
>                      commands after the -- will be passed to dataproc
>                      --cluster <your-cluster> must be specified after the --
>                      spark properties and some common spark-submit parameters will be translated
>                      to dataproc equivalents

I then tried using:

--spark-runner local[2]

That also didn't work. I would appreciate some guidance. Many thanks.
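
For reference, the help text quoted above says the Spark options control how Spark tools are run, and lists LOCAL, SPARK and GCS as the only valid runner targets, so `local[2]` reads like a value for --spark-master rather than for --spark-runner. The following is a minimal sketch under those assumptions, not a confirmed fix: it assumes the Spark variant of the tool (HaplotypeCallerSpark) is available in the installed GATK 4 release, it uses a single input BAM because whether the Spark tool accepts several -I inputs is not confirmed here, and the helper name `gatk_spark_cmd` and the thread count are purely illustrative. The script only prints the command so it can be inspected before running.

```shell
#!/bin/sh
# Sketch only. Assumptions (not from the original command):
#   - HaplotypeCallerSpark is present in the installed GATK 4 release
#   - a single -I input is used; multiple inputs are untested here
#   - gatk_spark_cmd is a hypothetical helper name

THREADS=2   # illustrative number of local Spark worker threads

# Print the command rather than running it, so it can be checked first.
# Arguments after the bare "--" are the Spark arguments: the runner target
# (LOCAL, per the help text) and the Spark master URL. The master URL is
# quoted so the shell does not treat the square brackets as a glob pattern.
gatk_spark_cmd() {
    printf '%s\n' "gatk HaplotypeCallerSpark -I my_pooled_sample.bam -L a_chromosome -R ref_genome.fna -O my_out_file.g.vcf -ploidy 10 -- --spark-runner LOCAL --spark-master 'local[$1]'"
}

gatk_spark_cmd "$THREADS"
```

If the printed command looks right for the installation, it can be pasted into the terminal as-is. The runner target and the master URL are kept as separate arguments because the help text treats them as distinct options.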

multithreading haplotypecaller sparks gatk • 153 views

I am sorry, I didn't realise that wasn't allowed. I have deleted the other post.
