Question: Looping over list in R
4 weeks ago by
rodolfo.peacewalker0 wrote:

Hi!

I have an issue with looping over a list of 54 data frames (in this case, converted to tibbles). The list looks like this:

> result
$`sample1prnk _ mir.db`
# A tibble: 2,368 x 8
   pathway                     pval     padj     ES   NES nMoreExtreme  size leadingEdge
   <chr>                      <dbl>    <dbl>  <dbl> <dbl>        <dbl> <int> <list>     
 1 MIR4795_3P              0.000114 0.000300 -0.432 -2.03            0   483 <chr [255]>
 2 MIR5696                 0.000114 0.000300 -0.495 -2.33            0   493 <chr [303]>
 3 MIR4659A_3P_MIR4659B_3P 0.000114 0.000300 -0.526 -2.47            0   479 <chr [260]>
 4 MIR7_1_3P               0.000114 0.000300 -0.549 -2.59            0   491 <chr [297]>
 5 MIR7_2_3P               0.000114 0.000300 -0.548 -2.58            0   491 <chr [297]>
 6 MIR3671                 0.000114 0.000300 -0.530 -2.49            0   463 <chr [298]>
 7 MIR1468_3P              0.000113 0.000300 -0.526 -2.48            0   496 <chr [279]>
 8 MIR548N                 0.000114 0.000300 -0.553 -2.60            0   476 <chr [264]>
 9 MIR4328                 0.000115 0.000300 -0.518 -2.42            0   445 <chr [228]>
10 MIR548H_3P_MIR548Z      0.000113 0.000300 -0.503 -2.37            0   500 <chr [298]>

$`sample1prnk _ positional.db`
# A tibble: 221 x 8
   pathway     pval   padj     ES    NES nMoreExtreme  size leadingEdge
   <chr>      <dbl>  <dbl>  <dbl>  <dbl>        <dbl> <int> <list>     
 1 chr10p11 0.0548  0.121  -0.503 -1.47           338    22 <chr [13]> 
 2 chr10p12 0.00248 0.0196 -0.566 -1.82            15    33 <chr [16]> 
 3 chr10p13 0.0133  0.0483 -0.564 -1.66            81    23 <chr [8]>  
 4 chr10p15 0.440   0.590   0.305  1.01          1609    27 <chr [2]>  
 5 chr10q11 0.0538  0.120  -0.423 -1.42           349    41 <chr [8]>  
 6 chr10q21 0.0104  0.0432 -0.533 -1.66            65    29 <chr [11]> 
 7 chr10q22 0.615   0.727   0.219  0.928         1872    80 <chr [7]>  
 8 chr10q23 0.00254 0.0196 -0.481 -1.73            16    57 <chr [18]> 
 9 chr10q24 0.876   0.913  -0.198 -0.777         6211    96 <chr [27]> 
10 chr10q25 0.0107  0.0438 -0.513 -1.65            68    33 <chr [19]> 
# … with 211 more rows

I want to sort each tibble by NES and then keep only the rows with padj < 0.05. For this I'm using the dplyr functions arrange(desc(NES)) and filter(padj < 0.05).
For a single element of the list, I ran result[[1]] %>% arrange(desc(NES)) %>% filter(padj < 0.05) or result$`sample1prnk _ mir.db` %>% arrange(desc(NES)) %>% filter(padj < 0.05), and the output was as expected. However, when I try to loop the operation using:

for (i in 1:length(result)) {
 result[[i]] %>% arrange(desc(NES)) %>% filter(padj < 0.05)
}

nothing happens.

I need your help to solve this issue!

Rodo.

4 weeks ago by
rpolicastro2.3k wrote:

Using map (the tidyverse equivalent to lapply) and an anonymous function makes this pretty easy.

library("tidyverse")

result <- map(result, ~filter(.x, padj < 0.05) %>% arrange(desc(NES)))

The equivalent in base R using lapply.

result <- lapply(result, function(x) {
  # keep significant rows, then sort by NES from highest to lowest
  x <- x[x$padj < 0.05, ]
  x[order(x$NES, decreasing=TRUE), ]
})

If you have a lot of data, the data.table package will likely be quicker.

library("data.table")

result <- lapply(result, function(x) {
  setDT(x)  # converts x to a data.table by reference (no copy)
  x[padj < 0.05][order(-NES)]
})
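
As for why the original for loop seemed to do nothing: inside a for loop R does not auto-print the value of an expression, and the result of the pipe was never assigned to anything, so it was computed and then discarded. A minimal sketch of a loop that keeps the results is below. The toy list and its values are made up for illustration and only stand in for the real 54 tibbles.

```r
library(dplyr)

# Hypothetical toy data standing in for the real list of tibbles.
result <- list(
  a = data.frame(pathway = c("p1", "p2", "p3"),
                 padj = c(0.01, 0.20, 0.03),
                 NES = c(-1.5, 2.0, 1.2),
                 stringsAsFactors = FALSE),
  b = data.frame(pathway = c("p4", "p5"),
                 padj = c(0.04, 0.50),
                 NES = c(0.9, -2.1),
                 stringsAsFactors = FALSE)
)

# Assign each processed table back into the list so the result is kept.
for (i in seq_along(result)) {
  result[[i]] <- result[[i]] %>%
    filter(padj < 0.05) %>%
    arrange(desc(NES))
}
```

seq_along(result) is used instead of 1:length(result) so the loop body is skipped cleanly if the list is ever empty.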

rpolicastro, thanks for your help. The map function worked well! Now I have an issue with plotting the data. I want to generate barplots of the NES values in each tibble, one plot per tibble, by running:

samps <- as.vector(names(result))

for (i in samps) {
  pdf(paste0(i, ".pdf"), width = 1000, height = 800)
  plt <- barplot.NES(result[[i]])
  dev.off()
}

However, damaged PDF files are generated.

written 4 weeks ago by rodolfo.peacewalker0

In your current code you are saving the plot to a variable but never rendering it into the file. Also, pdf() takes its width and height in inches, so you may be making a figure that is roughly 83 feet x 67 feet.

samps <- names(result)

for (i in samps) {
  pdf(paste0(i, ".pdf"), width = 10, height = 8)
  print(barplot.NES(result[[i]]))
  dev.off()
}

There's also a convenience function in purrr called iwalk that is designed for use cases like this. It iterates over a list's items and names at the same time.

library("tidyverse")

iwalk(result, function(x, y) {
  pdf(str_c(y, ".pdf"), width=10, height=8)
  print(barplot.NES(x))
  dev.off()
})
modified 4 weeks ago • written 4 weeks ago by rpolicastro2.3k

Thanks for your help! I ran both versions, but I get an error message. For the first one:

> for (i in samps) {
+   pdf(paste0(i, ".pdf"), width = 10, height = 8)
+   print(barplot.NES(result[[i]]))
+   dev.off()
+ }
Error: Faceting variables must have at least one value

I guess that is due to the class of the samps object.

written 4 weeks ago by rodolfo.peacewalker0

Finally, I resolved this issue.

Some of my tibbles were empty.
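
A minimal sketch of dropping the empty tables before plotting, in base R. The toy list below is hypothetical; with the real data, the subsetting line would be applied to result after the padj filter and before the plotting loop.

```r
# Hypothetical toy list: one table has rows, the other is empty after filtering.
result <- list(
  keep_me = data.frame(pathway = "p1", NES = 1.2, padj = 0.01),
  empty   = data.frame(pathway = character(0),
                       NES = numeric(0),
                       padj = numeric(0))
)

# Keep only the elements that still have at least one row.
non_empty <- result[vapply(result, nrow, integer(1)) > 0]

names(non_empty)
```

Looping over names(non_empty) instead of names(result) then skips the tables that have nothing to plot.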

written 28 days ago by rodolfo.peacewalker0