Looping over a function in R, and needing a variable name to save output as unique files
3 months ago
YanO ▴ 140

I have data from three experiments. I want to import the data into R, run some commands to clean, sort, and transform it, then plot a figure and save it as a PDF. I need to do this for each of the three experiments. I know enough to know that copy-pasting the code three times is a bad idea. I assume the best way to go is to write a function for the cleaning, sorting, transforming, and plotting, and then write a loop that calls this function once for each of my three experiments.

  1. What are the 'best practices' for handling a situation like this in R - for calling the same code for multiple experiments?

  2. How do I define the name of my experiment as a variable, so that I can incorporate it into the name of the saved output file?

For example, if I write a function:

analyze <- function(experiment, genes) {
    # other commands here...
    # then plot and save:
    pdf("experiment_and_genes_plot.pdf")
    my_plot <- Heatmap(my_cleaned_data)
    print(my_plot)
    dev.off()
}

How do I incorporate the name of my experiment as a variable so that when I call

analyze(experimentA, genesA) 

I get an output of experimentA_and_genesA_plot.pdf

3 months ago
darklings ▴ 320
pdf(paste0(experiment, "_and_", genes, "_plot.pdf"))

Thanks so much, but when I try that I get

Error in pdf(paste0(experiment, "_and_", genes, "_plot.pdf")) : 
  filename too long in pdf()

pdf(paste0(substr(experiment,1,10), "_and_", substr(genes,1,10), "_plot.pdf"))

Thank you, but now it's printing the first entries from experiment and genes into the pdf name.

I think the use of experiment and genes is calling these data frames, while I just want to call their names. Any ideas how to solve this?
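
A toy example (the data frame below is made up, not the real data) suggests why this happens: paste0() converts a data frame to text column by column, so the result is one long string per column rather than the variable's name.

```r
# Hypothetical stand-in for an experiment table
experiment <- data.frame(sample1 = c(1.5, 2.5), sample2 = c(3.5, 4.5))

# paste0() coerces the data frame to character, giving one string per column
file_names <- paste0(experiment, "_plot.pdf")
print(file_names)

# One element per column, and none of them contains the name "experiment"
length(file_names)
```

With a real table the strings hold every value in a column, which would explain both the "filename too long" error and the data values showing up after substr().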


You can print the full name to see what it looks like.


Any idea how I would do that?


print(paste0(experiment, "_and_", genes, "_plot.pdf")) ?

What are your inputs? We have no idea what you did with them before that.

3 months ago

Hi YanO!

Reading your post, I have some ideas to improve your code.

First of all, if you want to retrieve a plot as the final output of your function you should use:

analyze <- function(experiment, genes) {
    # other commands here...
    # then build the plot and return it:
    my_plot <- Heatmap(my_cleaned_data)
    return(my_plot)
}

Then, once your function is in your R environment, create two vectors containing the names of your experiment and gene objects:

experiment <- c("experimentA", "experimentB", "experimentC")
genes <- c("geneA", "geneB", "geneC")

Finally, loop over your vectors:

for (i in experiment) {
    for (j in genes) {
        pdf(paste0("..path_to_save_your_files/", i, "_", j, "_plot", ".pdf"))
        print(analyze(experiment = get(i), genes = get(j)))
        dev.off()
    }
}
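
Note that the nested loop runs every experiment against every gene set (nine files). If each experiment belongs with exactly one gene set, a paired loop over two named lists may be closer to what was asked. The following is only a sketch: the data, the base heatmap() call standing in for Heatmap(), and the tempdir() output folder are all assumptions made so that it runs on its own.

```r
# Hypothetical stand-ins for the real experiment and gene data
experiments <- list(
  experimentA = matrix(rnorm(20), nrow = 4),
  experimentB = matrix(rnorm(20), nrow = 4),
  experimentC = matrix(rnorm(20), nrow = 4)
)
gene_sets <- list(
  genesA = paste0("gene", 1:4),
  genesB = paste0("gene", 5:8),
  genesC = paste0("gene", 9:12)
)

# Placeholder analysis: base heatmap() is used so no extra package is needed
analyze <- function(experiment, genes) {
  rownames(experiment) <- genes
  heatmap(experiment)
}

out_dir <- tempdir()  # assumption: replace with your own output folder
saved <- character(0)

# seq_along() pairs the i-th experiment with the i-th gene set
for (i in seq_along(experiments)) {
  file_name <- paste0(names(experiments)[i], "_and_",
                      names(gene_sets)[i], "_plot.pdf")
  out_file <- file.path(out_dir, file_name)
  pdf(out_file)
  analyze(experiments[[i]], gene_sets[[i]])
  dev.off()
  saved <- c(saved, out_file)
}
print(basename(saved))
```

Because the names live in the list rather than in separate variables, no get() call is needed, and adding a fourth experiment only means adding one entry to each list.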

Hope it helps!

Best regards

Rodo

3 months ago
seidel 8.3k

You're handing your function data structures, and you'd like it to know the names of those data structures so that it can create a file name. I don't know how (or if) a function can know, internally, the name of the variable it was called with. However, a simple solution would be to supply another argument for the file name.

analyze(experimentA, genesA, pdfname="experimentA_genesA_plot.pdf")

Or supply them as strings so that you can use paste() to create a name.

analyze(experimentA, genesA, "experimentA", "genesA")
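
On the question of whether a function can see the name it was called with: base R's substitute() and deparse() can recover the expression that was typed for an argument. A minimal sketch, where plot_file_name() and the data are made up for illustration:

```r
# Sketch: recover the names of the arguments as written in the call
plot_file_name <- function(experiment, genes) {
  # substitute() captures the unevaluated argument; deparse() turns it into text
  exp_name <- deparse(substitute(experiment))
  genes_name <- deparse(substitute(genes))
  paste0(exp_name, "_and_", genes_name, "_plot.pdf")
}

experimentA <- data.frame(x = 1:3)  # made-up data
genesA <- c("g1", "g2", "g3")       # made-up data

file_name <- plot_file_name(experimentA, genesA)
print(file_name)
```

This only works when the function is called directly with a plain variable name. Inside a loop or lapply() the captured expression is whatever was written there (for example an indexing expression), so passing the names explicitly is usually the more robust choice.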

Alternatively, there are functions that let you "get" the data stored under a specific name, or "assign" data to a variable with a known name. This can be useful, for instance, if your data is in a file with a given name and you'd like to create data structures based on that name, or if you have the names of data structures in your code and want to read data into a structure with that name. Look at the help for get() and assign(). In fact, that might help you in this case:

# some data from experiment A
experimentA <- rnorm(5)

# function to analyze data
analyze <- function(x){
    # get the data named by the string x
    y <- get(x)
    cat(y, "\n")
    cat("mean: ", mean(y), "\n")
    # x is just a string so I can create a file name with it
    plotname <- paste0(x, "_plot.pdf")
    cat(plotname, "\n")
}

# analyze data in a variable called experimentA
analyze("experimentA")

Result:

-0.0193227 0.7361533 1.334033 -0.1113091 0.2921196 
mean: 0.4463348 
experimentA_plot.pdf
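
The same idea might extend to several experiments at once with mget(), which looks up a vector of names and returns a named list. This is a sketch with made-up data, not a tested pipeline:

```r
# Made-up data for two experiments
experimentA <- rnorm(5)
experimentB <- rnorm(5)

experiment_names <- c("experimentA", "experimentB")

# mget() fetches each object by name and returns them as a named list
experiments <- mget(experiment_names)

plot_names <- character(0)
for (nm in names(experiments)) {
  cat(nm, "mean:", mean(experiments[[nm]]), "\n")
  # nm is a string, so it can go straight into the file name
  plot_names <- c(plot_names, paste0(nm, "_plot.pdf"))
}
print(plot_names)
```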