Question: Converting Python Code To R
0
6.2 years ago by
ruchiksy50
Singapore
ruchiksy50 wrote:

Hello,

I am trying to convert the Python code of RDXplorer to R to make the pre-processing easier and more efficient.

Here is the code in Python:

#!/usr/bin/env python


import pileup as plp
import v6
import globals as glob
import file_utils as fu
import AnalyzeSequence as ans
import os
import sys
import rpy2.robjects as ro
import rpy2.robjects.numpy2ri
import rdxplorer_api as rdxp

if len(sys.argv) < 13:  # argv[0] is the program name; sys.argv[12] is read below
    rdxp.usage()

else:
    path2bam=sys.argv[1]
    reference=sys.argv[2]
    wrkgdir=sys.argv[3]
    chromOfInterest=sys.argv[4]

    gender=sys.argv[5]
    hg=sys.argv[6]
    winSize=sys.argv[7]
    baseCopy=sys.argv[8]
    filter=sys.argv[9]
    sumWithZero=sys.argv[10]
    debug=sys.argv[11]
    delete=sys.argv[12]

    debug=fu.str2bool(debug)
    delete=fu.str2bool(delete)
    sumWithZero=fu.str2bool(sumWithZero)
    baseCopy=int(baseCopy)
    winSize=int(winSize)
    filter=int(filter)

    if rdxp.complainAndBail() == True:
        if debug==True:
            print("The following arguments have been accepted:")
            a=0
            for arg in sys.argv:
                if a==0:
                    print("Program: " + arg)
                elif a==1:
                    print("Bam file name: " + arg)
                else:
                    print('\t' + arg)
                a=a+1

        accepted_chromosomes = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20","21","22", "X", "Y"]
        if fu.find_first_index(accepted_chromosomes, chromOfInterest) < 0:
            chromOfInterest = 'All'

I have not done any programming in R or Python, so I am wondering how I should go about starting the conversion.

Any tips/help would be much appreciated.

Thanks

python code conversion R • 13k views
ADD COMMENTlink modified 6.2 years ago by Michael Dondrup46k • written 6.2 years ago by ruchiksy50
6

You don't know either language, but you think it would be more efficient in R? I would close this question.

ADD REPLYlink written 6.2 years ago by Ido Tamir5.0k
1

Except for huge vectors and matrices, R is far slower than Python.

ADD REPLYlink written 6.2 years ago by lh331k
1

Programming in R is a mess compared with Python (which is a mess in its own Pythonic way). What problem are you really trying to solve?

ADD REPLYlink written 6.2 years ago by Alex Reynolds28k
12
6.2 years ago by
Istvan Albert ♦♦ 80k
University Park, USA
Istvan Albert ♦♦ 80k wrote:

Well, realistically, the first step would be to learn R.

After all, if you haven't done any programming in that language, what good would it do to convert the script to R?

There is a lot more to data analysis than having a single converted script.

ADD COMMENTlink written 6.2 years ago by Istvan Albert ♦♦ 80k
7
6.2 years ago by
KCC3.9k
Cambridge, MA
KCC3.9k wrote:

Taking a quick look at this code and the website for RDXplorer, I don't think your assumption that the pre-processing would be faster in R is correct. Here is my anecdotal argument: this is hybrid R/Python code. The code you shared imports this module:

import rpy2.robjects

rpy2 is a package for calling R code from Python. So, the original programmers gave some thought to what to implement in Python and what to implement in R. I think a good rule of thumb is that you should at least match the level of sophistication of the original programmers before trying to second-guess them. I hope that doesn't sound too condescending, but you did mention that you don't know either R or Python. So, I hope it seems fair to you that I would question whether you can make the right decisions about which parts are best implemented in R vs. Python. (I don't feel I could second-guess them either. I would need to spend a few weeks reading through their code until I felt I knew how all the parts worked and could make general guesses about why certain parts were implemented in R vs. Python.)

Why did you pick this script in particular? It seems like there must be a lot of Python code in this package. The website lists that they also use SciPy and NumPy.

I should say that for simple scripts you can often do a line-by-line conversion. For reasonably complicated programs (which this one seems to be), you would probably need to redesign the whole program, putting in a similar level of effort to what it took to write the original program in the first place.

ADD COMMENTlink modified 6.2 years ago • written 6.2 years ago by KCC3.9k
4
6.2 years ago by
Bergen, Norway
Michael Dondrup46k wrote:

I don't see any computation in this script; it is mainly parameter handling. In fact, it seems like the part of your script that does something non-trivial is missing. So, as your script does nothing, it is very simple to convert to R ;)

ADD COMMENTlink written 6.2 years ago by Michael Dondrup46k
1

Hmmm. I assumed that the packages were not loaded for no reason and that ruchiksy just hadn't pasted the whole script. But I guess you're right, this could be translated to R relatively easily, i.e., complain about parameters if they are not the right number, otherwise store some parameters and do nothing with them. Good catch! +1
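
To make that concrete, here is a rough, self-contained Python sketch of what the pasted fragment does: check the argument count, convert a few values, and fall back to 'All' for an unrecognised chromosome. It is not the RDXplorer code. `parse_args`, `ARG_NAMES` and this `str2bool` are hypothetical stand-ins (the real `fu.str2bool` and `rdxp.usage` may behave differently), and returning `None` in place of calling a usage function is an assumption made for illustration.

```python
# Chromosome names the pasted script accepts; anything else becomes 'All'.
ACCEPTED_CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X", "Y"]

# The 12 positional arguments, in the order the pasted script reads them.
ARG_NAMES = [
    "path2bam", "reference", "wrkgdir", "chromOfInterest", "gender", "hg",
    "winSize", "baseCopy", "filter", "sumWithZero", "debug", "delete",
]


def str2bool(text):
    """Hypothetical stand-in for fu.str2bool; the real helper may differ."""
    return text.strip().lower() in ("true", "t", "yes", "1")


def parse_args(argv):
    """Return a dict of typed parameters, or None if too few were given."""
    # argv[0] is the program name, so 12 parameters need len(argv) >= 13.
    if len(argv) < len(ARG_NAMES) + 1:
        return None
    params = dict(zip(ARG_NAMES, argv[1:]))
    # Numeric parameters arrive as strings and are converted to int.
    for name in ("winSize", "baseCopy", "filter"):
        params[name] = int(params[name])
    # Flag parameters arrive as strings and are converted to bool.
    for name in ("sumWithZero", "debug", "delete"):
        params[name] = str2bool(params[name])
    # Unknown chromosome names fall back to 'All', as in the pasted script.
    if params["chromOfInterest"] not in ACCEPTED_CHROMOSOMES:
        params["chromOfInterest"] = "All"
    return params
```

A script would call `parse_args(sys.argv)` and print a usage message when it returns `None`. Nothing here does any analysis, which is the point: an R version of this part would be equally short, and the real work lives in the modules the fragment imports.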

ADD REPLYlink written 6.2 years ago by KCC3.9k