Passing multiple arguments to external programs in a Pipeline with Python
7.2 years ago
barslmn ★ 2.1k

Greetings,

I'm trying to build a pipeline for NGS.

I made a small example pipeline that passes commands to the shell. It has two scripts, called from the shell, that concatenate (sumtool.py) and multiply (multool.py) values in many dataframes (10 in this case). My wrapper (wrapper.py) handles the input and passes the commands that run the scripts in order. Here is the relevant part of the code from the wrapper:

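from functools import wraps
from subprocess import Popen

def run_cmd(orig_func):
    # Decorator: join the wrapped function's return values into one shell command and run it.
    @wraps(orig_func)
    def wrapper(*args,**kwargs):
        cmdls = orig_func(*args,**kwargs)
        cmdc = ' '.join(str(arg) for arg in cmdls)
        cmd = cmdc.replace(',','')
        Popen(cmd,shell=True).wait()  # blocks until this one command has finished
    return wrapper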

@run_cmd
def runsumtool(*args):
    return args

for file in getcsv():
    runsumtool('python3','sumtool.py','--infile={}'.format(file),'--outfile={}'.format(dirlist[1]))

This works, but I want to launch the first script for all the dataframes at once, wait for all of those runs to finish, and then launch the second script for all the dataframes at once. Because Popen().wait() blocks after each command, the commands run one at a time and the whole pipeline takes much longer.
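A minimal sketch of the "start everything, then wait" pattern described above, using only the standard library. The helper names build_commands and run_batch are hypothetical, and sys.executable stands in for the 'python3' string in the wrapper; the --infile/--outfile options are the ones from the call shown earlier. Passing an argument list to Popen avoids building a shell string.

```python
import subprocess
import sys


def build_commands(script, infiles, outdir):
    """Return one argument list per input file (no shell string needed)."""
    return [
        [sys.executable, script,
         "--infile={}".format(path), "--outfile={}".format(outdir)]
        for path in infiles
    ]


def run_batch(commands):
    """Start every command first, then wait for all of them to finish."""
    procs = [subprocess.Popen(cmd) for cmd in commands]  # all processes start here
    return [proc.wait() for proc in procs]               # block until each one exits
```

Under these assumptions, run_batch(build_commands('sumtool.py', getcsv(), dirlist[1])) would run the first step for every file, and a second run_batch call with multool.py would start only after the first has returned. Note that this starts one process per file with no upper limit, which may be too many for heavy tools.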

I tried to use luigi for this, but I couldn't get it to run external programs or to handle multiple inputs and outputs. Any tips on that are appreciated.

Another solution I'm imagining is to run each sample through the pipeline individually, with all samples started at once, but I'm not sure how to write that in Python (or any other language, really). This would also solve the I/O problem with luigi.
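One way the per-sample idea could look, as a sketch rather than a definitive answer: each sample gets its own ordered list of commands, and a thread pool from the standard library runs several samples concurrently. The names run_sample and run_samples and the max_workers default are assumptions made for illustration.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_sample(steps):
    """Run one sample's commands in order; stop at the first failure."""
    for cmd in steps:
        # check=True raises CalledProcessError on a non-zero exit code
        subprocess.run(cmd, check=True)
    return len(steps)


def run_samples(per_sample_steps, max_workers=4):
    """Process samples concurrently; each sample keeps its own step order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() collects the results and re-raises any worker exception
        return list(pool.map(run_sample, per_sample_steps))
```

Here per_sample_steps would hold, for each file, the sumtool.py command followed by the multool.py command. Threads are enough in this case because the actual work happens in the child processes, and max_workers caps how many samples run at the same time.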

thanks

Note1: This is a small example pipeline I built. My main purpose is to call programs like bwa and picard, which I cannot import, in a pipeline.

Note2: I'm using Popen from subprocess already. You can find it between lines 4 and 5.

next-gen pipeline python • 4.9k views

This looks to me like an advanced Python programming question that would be better asked on SO. I'm also not sure I fully understand the question, but it looks like you could use a lightweight workflow management system like one of these.

I asked this on SO, but got no answers :/. Thanks for the link, it is helpful.

I run external programs in Python using subprocess, without Popen. If you're going to use bwa and picard, I imagine you have fastq files; if they're paired-end, you can use glob to collect them into tuples from a directory and then process them.
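A rough sketch of the glob idea, assuming the mates are named with _R1 and _R2 before a .fastq.gz extension. That naming scheme and the function name paired_fastqs are assumptions; adjust the tags to match your files.

```python
import glob
import os


def paired_fastqs(directory, r1_tag="_R1", r2_tag="_R2", ext=".fastq.gz"):
    """Return sorted (R1, R2) tuples for every R1 file that has a matching R2."""
    pairs = []
    suffix = r1_tag + ext
    for r1 in sorted(glob.glob(os.path.join(directory, "*" + suffix))):
        # Swap the R1 suffix for the R2 suffix to get the expected mate path
        r2 = r1[: -len(suffix)] + r2_tag + ext
        if os.path.exists(r2):
            pairs.append((r1, r2))
    return pairs
```

Each tuple can then be turned into the argument list for one external command and run with subprocess. R1 files without a mate are skipped silently in this sketch.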
