Looping over a function in R, and needing a variable name to save output as unique files
5 weeks ago
YanO ▴ 140

I have data from three experiments. I want to import the data into R, perform some commands to clean/sort/transform the data etc., then plot a figure and save as a pdf. I need to do this for each of the three experiments. I know enough to know that copy-pasting the code 3 times is a bad idea. I assume the best way to go is to write a function for my cleaning/sorting/transforming and plotting, and then write a loop to call this function 3 times for each of my experiments.

1. What are the 'best practices' for handling a situation like this in R - for calling the same code for multiple experiments?

2. How do I define the name of my experiment as a variable, so that I can incorporate it into the name of the saved output file?

For example, if I write a function:

analyze <- function(experiment, genes) {
  # other commands here...
  # then plot and save:
  pdf("experiment_and_genes_plot.pdf")
  my_plot <- Heatmap(my_cleaned_data)
  print(my_plot)
  dev.off()
}


How do I incorporate the name of my experiment as a variable so that when I call

analyze(experimentA, genesA)


I get an output of experimentA_and_genesA_plot.pdf

5 weeks ago
darklings ▴ 280
pdf(paste0(experiment, "_and_", genes, "_plot.pdf"))


Thanks so much, but when I try that I get

Error in pdf(paste0(experiment, "_and_", genes, "_plot.pdf")) :
filename too long in pdf()

pdf(paste0(substr(experiment,1,10), "_and_", substr(genes,1,10), "_plot.pdf"))


Thank you, but now it's printing the first entries from experiment and genes into the pdf name.

I think the use of experiment and genes is calling these data frames, while I just want to call their names. Any ideas how to solve this?


You can print the full name to see what it looks like.


Any idea how I would do that?


print(paste0(experiment, "_and_", genes, "_plot.pdf")) ?

What are your inputs? We have no idea what you did with them before that.

5 weeks ago

Hi YanO!

First of all, if you want to retrieve a plot as the final output of your function you should use:

analyze <- function(experiment, genes) {
  # other commands here...
  # then build the plot and return it:
  my_plot <- Heatmap(my_cleaned_data)
  return(my_plot)
}


Then, once your function is in your R environment, create two character vectors containing the names of your experiment and gene objects:

experiment <- c("experimentA", "experimentB", "experimentC")
genes <- c("genesA", "genesB", "genesC")


for (i in experiment) {
  for (j in genes) {
    pdf(paste0("..path_to_save_your_files/", i, "_", j, "_plot", ".pdf"))
    print(analyze(experiment = get(i), genes = get(j)))
    dev.off()
  }
}
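
Note that the nested loop above writes one file for every experiment/gene-set combination (nine in total). If each experiment has exactly one matching gene set, as the call analyze(experimentA, genesA) in the question suggests, a paired loop may be closer to what is wanted. Below is a minimal sketch under that assumption. The toy data, the use of tempdir() as the output folder, and base image() standing in for Heatmap() are all placeholders for illustration, not part of the original code.

```r
# Toy stand-ins for the real data (hypothetical values)
experimentA <- matrix(rnorm(20), nrow = 4)
experimentB <- matrix(rnorm(20), nrow = 4)
genesA <- paste0("gene", 1:4)
genesB <- paste0("gene", 5:8)

experiment_names <- c("experimentA", "experimentB")
gene_names <- c("genesA", "genesB")

out_dir <- tempdir()  # assumption: replace with your own output folder
saved_files <- character(0)

# seq_along() walks both name vectors in parallel, one pair per iteration
for (k in seq_along(experiment_names)) {
  exp_name <- experiment_names[k]
  gene_name <- gene_names[k]
  exp_data <- get(exp_name)    # look up the object by its name
  gene_ids <- get(gene_name)
  rownames(exp_data) <- gene_ids

  out_file <- file.path(out_dir, paste0(exp_name, "_and_", gene_name, "_plot.pdf"))
  pdf(out_file)
  # base-graphics placeholder for the real plotting code
  image(exp_data, main = paste(exp_name, gene_name))
  dev.off()
  saved_files <- c(saved_files, out_file)
}

basename(saved_files)
```

Because the names are plain strings here, they can be pasted into the file name directly, while get() retrieves the data they refer to.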


Hope it helps!

Best regards

Rodo

5 weeks ago
seidel 8.2k

You're handing your function data structures, and you'd like it to know the names of those data structures so that it can create a file name. I don't know how (or if) a function can know internally the name of the variable it was called with; however, a simple solution would be to supply another argument for the file name.

analyze(experimentA, genesA, pdfname="experimentA_genesA_plot.pdf")


Or supply them as strings so that you can use paste() to create a name.

analyze(experimentA, genesA, "experimentA", "genesA")
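
For what it's worth, base R can recover the expression an argument was called with, using substitute() together with deparse(). The sketch below only illustrates that idea. The helper name plot_file_name and the toy data are hypothetical, and it only builds the file name rather than doing any analysis.

```r
# Hypothetical helper: build a file name from the names of the objects passed in
plot_file_name <- function(experiment, genes) {
  # substitute() captures the unevaluated argument; deparse() turns it into a string
  exp_name <- deparse(substitute(experiment))
  gene_name <- deparse(substitute(genes))
  paste0(exp_name, "_and_", gene_name, "_plot.pdf")
}

experimentA <- data.frame(value = rnorm(5))  # toy data
genesA <- c("g1", "g2")                      # toy data

plot_file_name(experimentA, genesA)
# [1] "experimentA_and_genesA_plot.pdf"
```

This captures whatever expression appears in the call, so inside a loop such as for (x in my_list) plot_file_name(x, g) it would produce "x_and_g_plot.pdf" rather than the original object names. Passing the names explicitly tends to be more robust in that situation.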


Alternatively, there are functions that let you "get" the data stored under a given name, or "assign" data to a variable with a given name. This can be useful, for instance, if your data is in a file with a given name and you'd like to create data structures based on that name, or if you have the names of data structures in your code and you want to read data into a structure with that name. Look at the help for get() and assign(). In fact, that might help you in this case:

# some data from experiment A
experimentA <- rnorm(5)

# function to analyze data
analyze <- function(x) {
  # get the data named by the string x
  y <- get(x)
  cat(y, "\n")
  cat("mean: ", mean(y), "\n")
  # x is just a string, so I can create a file name with it
  plotname <- paste0(x, "_plot.pdf")
  cat(plotname, "\n")
}

# analyze data in a variable called experimentA
analyze("experimentA")


Result:

-0.0193227 0.7361533 1.334033 -0.1113091 0.2921196
mean: 0.4463348
experimentA_plot.pdf
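
On the "best practices" part of the question, one common pattern that avoids get() altogether is to keep the data sets in a named list and loop over the names, so the data and its label always travel together. The following is a sketch under that assumption. The function name analyze_and_save, the toy data, the tempdir() output folder, and the base hist() call standing in for the real plot are all hypothetical.

```r
# Assumption: each experiment is stored as one element of a named list
experiments <- list(
  experimentA = rnorm(50),
  experimentB = rnorm(50, mean = 2),
  experimentC = rnorm(50, mean = -1)
)

# Hypothetical function: takes the data and its name as separate arguments
analyze_and_save <- function(data, name, out_dir = tempdir()) {
  out_file <- file.path(out_dir, paste0(name, "_plot.pdf"))
  pdf(out_file)
  on.exit(dev.off())  # close the device even if plotting fails
  hist(data, main = name)
  invisible(out_file)
}

# Loop over the names so both the data and its label are available
out_files <- vapply(
  names(experiments),
  function(nm) analyze_and_save(experiments[[nm]], nm),
  character(1)
)
basename(out_files)
```

Adding a fourth experiment then only means adding one element to the list; the function and the loop stay unchanged.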