Looping over list in R
13 months ago

Hi!

I have an issue with looping over a list containing 54 data frames (in this case, converted to tibbles). The list content looks like:

> result
$sample1prnk_mir.db
# A tibble: 2,368 x 8
   pathway                     pval     padj     ES   NES nMoreExtreme  size leadingEdge
   <chr>                      <dbl>    <dbl>  <dbl> <dbl>        <dbl> <int> <list>
 1 MIR4795_3P              0.000114 0.000300 -0.432 -2.03            0   483 <chr [255]>
 2 MIR5696                 0.000114 0.000300 -0.495 -2.33            0   493 <chr [303]>
 3 MIR4659A_3P_MIR4659B_3P 0.000114 0.000300 -0.526 -2.47            0   479 <chr [260]>
 4 MIR7_1_3P               0.000114 0.000300 -0.549 -2.59            0   491 <chr [297]>
 5 MIR7_2_3P               0.000114 0.000300 -0.548 -2.58            0   491 <chr [297]>
 6 MIR3671                 0.000114 0.000300 -0.530 -2.49            0   463 <chr [298]>
 7 MIR1468_3P              0.000113 0.000300 -0.526 -2.48            0   496 <chr [279]>
 8 MIR548N                 0.000114 0.000300 -0.553 -2.60            0   476 <chr [264]>
 9 MIR4328                 0.000115 0.000300 -0.518 -2.42            0   445 <chr [228]>
10 MIR548H_3P_MIR548Z      0.000113 0.000300 -0.503 -2.37            0   500 <chr [298]>

$sample1prnk_positional.db
# A tibble: 221 x 8
pathway     pval   padj     ES    NES nMoreExtreme  size leadingEdge
<chr>      <dbl>  <dbl>  <dbl>  <dbl>        <dbl> <int> <list>
1 chr10p11 0.0548  0.121  -0.503 -1.47           338    22 <chr [13]>
2 chr10p12 0.00248 0.0196 -0.566 -1.82            15    33 <chr [16]>
3 chr10p13 0.0133  0.0483 -0.564 -1.66            81    23 <chr [8]>
4 chr10p15 0.440   0.590   0.305  1.01          1609    27 <chr [2]>
5 chr10q11 0.0538  0.120  -0.423 -1.42           349    41 <chr [8]>
6 chr10q21 0.0104  0.0432 -0.533 -1.66            65    29 <chr [11]>
7 chr10q22 0.615   0.727   0.219  0.928         1872    80 <chr [7]>
8 chr10q23 0.00254 0.0196 -0.481 -1.73            16    57 <chr [18]>
9 chr10q24 0.876   0.913  -0.198 -0.777         6211    96 <chr [27]>
10 chr10q25 0.0107  0.0438 -0.513 -1.65            68    33 <chr [19]>
# … with 211 more rows


What I want to do is sort each tibble by NES value and then keep only the rows with padj < 0.05. For this purpose, I'm using the dplyr functions arrange(desc(NES)) and filter(padj < 0.05).
For one element of the list, I ran

result[[1]] %>% arrange(desc(NES)) %>% filter(padj < 0.05)

or

result$sample1prnk_mir.db %>% arrange(desc(NES)) %>% filter(padj < 0.05)

and the output was as I expected. However, when I try to loop the operation using

for (i in 1:length(result)) {
result[[i]] %>% arrange(desc(NES)) %>% filter(padj < 0.05)
}

nothing happens. I need your help to solve this issue!

Rodo.

Using map (the tidyverse equivalent of lapply) with an anonymous function makes this pretty easy.

library("tidyverse")

result <- map(result, ~filter(.x, padj < 0.05) %>% arrange(desc(NES)))

The equivalent in base R, using lapply:

result <- lapply(result, function(x) {
x <- x[x$padj < 0.05, ]
x <- x[order(x$NES, decreasing=TRUE), ]
})


If you have a lot of data, the data.table library will be quicker.

library("data.table")

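result <- lapply(result, function(x) {
setDT(x)                      # convert to data.table by reference
x[padj < 0.05][order(-NES)]   # filter, then sort by descending NES
})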
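
As for why the original for loop appeared to do nothing: inside a for loop R does not auto-print the value of an expression, and the output of the pipeline was never assigned back into the list, so each result was simply discarded. A minimal sketch of a loop that keeps its results, written in base R on a small made-up list (the `toy` object and its values are invented for illustration and are not the asker's data):

```r
# Toy stand-in for `result`; the names and values are invented for illustration.
toy <- list(
  a = data.frame(pathway = c("p1", "p2", "p3"),
                 padj = c(0.01, 0.20, 0.03),
                 NES = c(-1.5, 2.0, 1.2)),
  b = data.frame(pathway = c("p4", "p5"),
                 padj = c(0.50, 0.04),
                 NES = c(0.3, -2.1))
)

# seq_along() is safer than 1:length() when the list might be empty.
for (i in seq_along(toy)) {
  x <- toy[[i]]
  x <- x[x$padj < 0.05, ]                     # keep significant rows
  x <- x[order(x$NES, decreasing = TRUE), ]   # sort by descending NES
  toy[[i]] <- x                               # assign back, otherwise the result is discarded
}
```

With dplyr the same idea would be to write the pipeline's output back to result[[i]] inside the loop, which is what map and lapply do for you.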

rpolicastro, thanks for your help. The map function worked well! Now I have an issue with plotting the data. I want to generate barplots of the data in the tibbles, ordered by NES value. I am trying to create one plot per tibble by running:

samps <- as.vector(names(result))

for (i in samps) {
pdf(paste0(i, ".pdf"), width = 1000, height = 800)
plt <- barplot.NES(result[[i]])
dev.off()
}


However, the PDF files that are generated are damaged.

In your current code you are saving the plot to a variable but never rendering it into the file. Also, the default units for pdf are inches I believe, so you may be making a figure that is 83 feet x 67 feet.

samps <- names(result)

for (i in samps) {
pdf(paste0(i, ".pdf"), width = 10, height = 8)
print(barplot.NES(result[[i]]))
dev.off()
}


There's also another convenience function in purrr called iwalk that is designed for use cases like this. It iterates over a list's items and names at the same time.

library("tidyverse")

iwalk(result, function(x, y) {
pdf(str_c(y, ".pdf"), width=10, height=8)
print(barplot.NES(x))
dev.off()
})

Thanks for your help! I ran both versions, but each gives an error message. For the first one:

> for (i in samps) {
+   pdf(paste0(i, ".pdf"), width = 10, height = 8)
+   print(barplot.NES(result[[i]]))
+   dev.off()
+ }
Error: Faceting variables must have at least one value


I guess that is due to the class of the samps object.

Finally, I resolved this issue.

Some of my tibbles were empty.
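
For anyone hitting the same faceting error, one way to guard against it is to drop the empty elements before plotting. A minimal sketch in base R on an invented toy list (the `toy` object and its values are made up; barplot.NES is the asker's own plotting function and is not called here):

```r
# Toy stand-in for the filtered `result`; names and values are invented.
toy <- list(
  s1 = data.frame(pathway = c("p1", "p2"), NES = c(2.1, -1.8)),
  s2 = data.frame(pathway = character(0), NES = numeric(0)),  # empty after filtering
  s3 = data.frame(pathway = "p3", NES = 1.4)
)

# Keep only the elements that still have at least one row.
non_empty <- Filter(function(x) nrow(x) > 0, toy)

names(non_empty)
```

The plotting loop can then iterate over names(non_empty) instead of names(result). In the tidyverse, purrr's keep and discard functions do the same kind of filtering on a list.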