Looping over list in R
3.5 years ago

Hi!

I have an issue with looping over a list of 54 data frames (in this case, converted to tibbles). The list looks like this:

> result
$`sample1prnk _ mir.db`
# A tibble: 2,368 x 8
   pathway                     pval     padj     ES   NES nMoreExtreme  size leadingEdge
   <chr>                      <dbl>    <dbl>  <dbl> <dbl>        <dbl> <int> <list>     
 1 MIR4795_3P              0.000114 0.000300 -0.432 -2.03            0   483 <chr [255]>
 2 MIR5696                 0.000114 0.000300 -0.495 -2.33            0   493 <chr [303]>
 3 MIR4659A_3P_MIR4659B_3P 0.000114 0.000300 -0.526 -2.47            0   479 <chr [260]>
 4 MIR7_1_3P               0.000114 0.000300 -0.549 -2.59            0   491 <chr [297]>
 5 MIR7_2_3P               0.000114 0.000300 -0.548 -2.58            0   491 <chr [297]>
 6 MIR3671                 0.000114 0.000300 -0.530 -2.49            0   463 <chr [298]>
 7 MIR1468_3P              0.000113 0.000300 -0.526 -2.48            0   496 <chr [279]>
 8 MIR548N                 0.000114 0.000300 -0.553 -2.60            0   476 <chr [264]>
 9 MIR4328                 0.000115 0.000300 -0.518 -2.42            0   445 <chr [228]>
10 MIR548H_3P_MIR548Z      0.000113 0.000300 -0.503 -2.37            0   500 <chr [298]>

$`sample1prnk _ positional.db`
# A tibble: 221 x 8
   pathway     pval   padj     ES    NES nMoreExtreme  size leadingEdge
   <chr>      <dbl>  <dbl>  <dbl>  <dbl>        <dbl> <int> <list>     
 1 chr10p11 0.0548  0.121  -0.503 -1.47           338    22 <chr [13]> 
 2 chr10p12 0.00248 0.0196 -0.566 -1.82            15    33 <chr [16]> 
 3 chr10p13 0.0133  0.0483 -0.564 -1.66            81    23 <chr [8]>  
 4 chr10p15 0.440   0.590   0.305  1.01          1609    27 <chr [2]>  
 5 chr10q11 0.0538  0.120  -0.423 -1.42           349    41 <chr [8]>  
 6 chr10q21 0.0104  0.0432 -0.533 -1.66            65    29 <chr [11]> 
 7 chr10q22 0.615   0.727   0.219  0.928         1872    80 <chr [7]>  
 8 chr10q23 0.00254 0.0196 -0.481 -1.73            16    57 <chr [18]> 
 9 chr10q24 0.876   0.913  -0.198 -0.777         6211    96 <chr [27]> 
10 chr10q25 0.0107  0.0438 -0.513 -1.65            68    33 <chr [19]> 
# … with 211 more rows

I want to sort each tibble by NES value and then keep only the rows with padj < 0.05. For this I'm using the dplyr functions arrange(desc(NES)) and filter(padj < 0.05).
For a single element of the list, I ran result[[1]] %>% arrange(desc(NES)) %>% filter(padj < 0.05) or result$`sample1prnk _ mir.db` %>% arrange(desc(NES)) %>% filter(padj < 0.05), and the output was as expected. However, when I try to loop the operation using:

for (i in 1:length(result)) {
 result[[i]] %>% arrange(desc(NES)) %>% filter(padj < 0.05)
}

nothing happens.

I need your help to solve this issue!

Rodo.

RNA-Seq Loop List R • 1.2k views
3.5 years ago

Using map (the tidyverse equivalent to lapply) and an anonymous function makes this pretty easy.

library("tidyverse")

result <- map(result, ~filter(.x, padj < 0.05) %>% arrange(desc(NES)))

The equivalent in base R using lapply.

result <- lapply(result, function(x) {
  x <- x[x$padj < 0.05, ]                # keep rows with padj < 0.05
  x[order(x$NES, decreasing = TRUE), ]   # sort by NES, highest first, and return
})

If you have a lot of data, the data.table package will usually be quicker.

library("data.table")

result <- lapply(result, function(x) {
  setDT(x)                       # convert to data.table by reference
  x[padj < 0.05][order(-NES)]    # filter, then sort by NES (descending)
})
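
For completeness, here is a minimal sketch of why the original for loop appeared to do nothing: inside a loop the value of an expression is neither printed nor stored, so each filtered tibble was computed and then thrown away. Assigning the result back to the list element fixes that. The toy list and its values below are invented for illustration only.

```r
library(dplyr)

# Toy stand-in for the real list of result tibbles (values invented)
result <- list(
  a = tibble::tibble(pathway = c("p1", "p2", "p3"),
                     padj = c(0.01, 0.20, 0.03),
                     NES = c(-2.0, 1.5, 1.1)),
  b = tibble::tibble(pathway = c("p4", "p5"),
                     padj = c(0.04, 0.90),
                     NES = c(0.8, 2.2))
)

# Assign the filtered, sorted tibble back to the list element;
# without the assignment the loop silently discards each result.
for (i in seq_along(result)) {
  result[[i]] <- result[[i]] %>%
    filter(padj < 0.05) %>%
    arrange(desc(NES))
}

result$a$pathway
```

seq_along(result) is used instead of 1:length(result) because it also behaves sensibly when the list is empty.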

rpolicastro, thanks for your help. The map function worked well! Now I have an issue with plotting the data. I want to generate barplots of the NES values in each tibble, one plot per tibble, by running:

samps <- as.vector(names(result))

for (i in samps) {
  pdf(paste0(i, ".pdf"), width = 1000, height = 800)
  plt <- barplot.NES(result[[i]])
  dev.off()
}

However, the PDF files that are generated are damaged.


In your current code you assign the plot to a variable but never render it into the file. Also, pdf() takes width and height in inches, so width = 1000, height = 800 asks for a figure of roughly 83 feet x 67 feet.

samps <- names(result)

for (i in samps) {
  pdf(paste0(i, ".pdf"), width = 10, height = 8)
  print(barplot.NES(result[[i]]))
  dev.off()
}

purrr also has a convenience function called iwalk that is designed for use cases like this. It iterates over a list's items and their names at the same time.

library("tidyverse")

iwalk(result, function(x, y) {
  pdf(str_c(y, ".pdf"), width=10, height=8)
  print(barplot.NES(x))
  dev.off()
})
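
Since barplot.NES is not shown in this thread, here is a self-contained sketch of the same pattern with a hypothetical stand-in function, barplot_nes_sketch, which is assumed to return a ggplot object. The toy data and the choice of a temporary output directory are also assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for barplot.NES(), which is not defined in this thread
barplot_nes_sketch <- function(df) {
  ggplot(df, aes(x = reorder(pathway, NES), y = NES)) +
    geom_col() +
    coord_flip() +
    labs(x = "pathway")
}

# Toy data (values invented)
result <- list(
  sampleA = data.frame(pathway = c("p1", "p2"), NES = c(-2.0, 1.5)),
  sampleB = data.frame(pathway = c("p3", "p4"), NES = c(0.8, 2.2))
)

out_dir <- tempdir()  # assumption: write into a temporary directory
for (i in names(result)) {
  out_file <- file.path(out_dir, paste0(i, ".pdf"))
  pdf(out_file, width = 10, height = 8)    # size is given in inches
  print(barplot_nes_sketch(result[[i]]))   # print() renders the plot into the open device
  dev.off()                                # close the device so the file is finalized
}
```

Each open pdf() device is paired with a dev.off(); a file whose device was never closed, or into which nothing was printed, will typically not open as a valid PDF.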

Thanks for your help! I ran both versions, but I get an error message. For the first one:

> for (i in samps) {
+   pdf(paste0(i, ".pdf"), width = 10, height = 8)
+   print(barplot.NES(result[[i]]))
+   dev.off()
+ }
Error: Faceting variables must have at least one value

I guess that is due to the class of the samps object.


Finally, I resolved this issue.

Some of my tibbles were empty.
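
For anyone hitting the same error: filtering on padj < 0.05 can leave some list elements with zero rows, and an empty data frame is consistent with the "Faceting variables must have at least one value" message. A minimal sketch of dropping the empty elements before plotting, using base R and invented toy data:

```r
# Toy list in which one element has zero rows (values invented)
result <- list(
  sampleA = data.frame(pathway = c("p1", "p2"), NES = c(-2.0, 1.5)),
  sampleB = data.frame(pathway = character(0), NES = numeric(0))
)

# Keep only the elements that still contain rows after filtering
non_empty <- Filter(function(x) nrow(x) > 0, result)

# Report which elements were dropped so nothing disappears silently
dropped <- setdiff(names(result), names(non_empty))
message("Skipping empty results: ", paste(dropped, collapse = ", "))

names(non_empty)
```

The plotting loop can then iterate over non_empty instead of result.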


iwalk is truly a thing of beauty.

Thanks for pointing it out.
