Question: Running a variant calling pipeline in parallel on multiple cores
AR (India) wrote, 5 days ago:

Hi all,

I have a variant calling pipeline made up of multiple steps from different tools. Currently, I am working on a Cray system. I have already run the commands individually on a single sample. Now I want to process multiple samples (24 samples at a time). I want to run my complete variant calling pipeline on the high-end server using the MPI module. My commands are in the Python script and I want to modify it for mpi4py. Just an example:

When run individually:

import os
os.system("command 1")

But when running multiple commands together on multiple cores:

from mpi4py import MPI
import os

Sample = ["1","2","3"]

for a in Sample:
    os.system("command1..input="+a+", output="+a+"_1")
    os.system("command2..input="+a+"_1, output="+a+"_2")
    os.system("command3..input="+a+"_2, output="+a+"_3")
    os.system("command4..input="+a+"_3, output="+a+"_4")
    os.system("command5..input="+a+"_4, output="+a+"_5")

comm = MPI.COMM_WORLD 
rank = comm.Get_rank()

This script is not working at all.

Can anyone please help me? I just want to run my Python script, which uses import os, on multiple processors at a time (20 samples on 20 cores), and I have to use only the MPI module.

Thank you

sequencing • 133 views
modified 5 days ago by RamRS • written 5 days ago by AR

"My commands are in the Python script"

Please don't. Use a workflow manager like Nextflow or Snakemake.

written 5 days ago by Pierre Lindenbaum

I agree with what Pierre Lindenbaum said.

modified 5 days ago • written 5 days ago by lakhujanivijay

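Use GNU parallel instead.

Pseudocode (your_sample_list and "your command" are placeholders):

import subprocess

cmd_file_name = "cmd_file.txt"
jobs = 20  # number of commands GNU parallel runs at once

# Write one command per line; "w" starts a fresh file on every run
with open(cmd_file_name, "w") as cmd_file:
    for i in your_sample_list:
        cmd = " ".join(["your command", "-i", i])
        cmd_file.write(cmd + "\n")

# join() needs strings, so the job count is converted with str()
parallel_cmd = " ".join(["parallel", "--eta", "-j", str(jobs), "<", cmd_file_name])

# shell=True is required for the "<" redirection
subprocess.run(parallel_cmd, shell=True)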
modified 5 days ago by RamRS • written 5 days ago by lakhujanivijay
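
Since the question asks for MPI specifically, here is a minimal sketch of how the loop from the question could be split across ranks with mpi4py. In the original script the rank is never used, so every MPI process would run the full loop over all samples. The sketch assumes the samples are independent of each other, so the ranks only need a final barrier. The helper names samples_for_rank and build_commands, and the "command1 input=... output=..." strings, are placeholders of mine and not real tool invocations. The mpi4py import sits inside main() so the two helpers can be tried on a machine without MPI.

```python
import subprocess

SAMPLES = ["1", "2", "3"]  # sample IDs, as in the question


def samples_for_rank(samples, rank, size):
    """Round-robin split: rank r takes samples r, r + size, r + 2*size, ..."""
    return samples[rank::size]


def build_commands(sample, n_steps=5):
    """Chain the placeholder steps so step k reads the output of step k-1."""
    commands = []
    current = sample
    for step in range(1, n_steps + 1):
        output = f"{sample}_{step}"
        # "command<k>" and its arguments are placeholders, not a real CLI
        commands.append(f"command{step} input={current} output={output}")
        current = output
    return commands


def main():
    # Imported here so the helpers above work without mpi4py installed
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()  # index of this process
    size = comm.Get_size()  # total number of processes

    # Each rank handles only its own share of the samples
    for sample in samples_for_rank(SAMPLES, rank, size):
        for cmd in build_commands(sample):
            # check=True raises if a step fails, so later steps do not
            # run on missing input
            subprocess.run(cmd, shell=True, check=True)

    comm.Barrier()  # wait until every rank has finished its samples
```

To use it, call main() at the end of the script and start it with the MPI launcher available on the system, for example mpiexec -n 20 python your_script.py (the script name is hypothetical, and Cray sites often provide a different launcher such as aprun or srun). With 20 samples and 20 ranks each rank processes one sample; with fewer ranks the samples are dealt out round-robin.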