Hi, I am relatively new to R, so apologies if the code/question is not in the right format! I will try to improve! I am trying to perform enrichment analysis with ReactomePA (R package) on a small list of genes (477), and I have a problem with organizing the dataset. As far as I understand, the input should contain only two columns: Entrez ID (column 1) and fold change (column 2). I converted the Ensembl IDs with the BioMart online tool, and then created a new file with ID and FC. My dataset:
    # A tibble: 6 x 2
      Entrezgene_ID log2fc
      <chr>          <dbl>
    1 14             -1.02
    2 80755          -1.45
    3 60496          -1.17
    4 6059           -1.48
    5 10061          -1.35
    6 10006          -1.51
Then I tried to follow this code:
    # load packages
    library(org.Hs.eg.db)
    library(DOSE)
    library(ReactomePA)
    ## feature 1: numeric vector
    geneList <- d[,2]
    ## feature 2: named vector
    names(geneList) <- as.character(d[,1])
    ## feature 3: decreasing order
    geneList <- sort(geneList, decreasing = TRUE)
    head(geneList)
But when I try to name the vector, I end up with a list of Entrez IDs separated by commas and no FC (geneList: 477 obs. of 1 variable, c("14", "80755", ... and so on)). I was expecting to find them in rows next to the fold change. Am I wrong? And when I then try to sort in decreasing order ("feature 3"), I get the error below, presumably because I basically have a list of quoted numbers that is not associated with any values:
    Error: Can't subset columns that don't exist.
    x Locations 141, 373, 119, 229, 230, etc. don't exist.
    i There are only 1 column.
Thank you very much for your help!
Plus, if you already have `tibble` loaded, you can use its `deframe` function to do this in one step: https://stackoverflow.com/a/56479548/1845650 ; `genes <- tibble::deframe(d)`. That only works for two-column data frames: the first column becomes the vector names and the second column becomes the vector contents.
There are several other ways of doing this mentioned in that SO thread, `dplyr::pull` for example.
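As a rough sketch of the `deframe` route, here is a toy version of the table from the question (only the six rows that were printed; the real `d` has 477). The object names `d` and `geneList` follow the question, and the sort step is the question's "feature 3" applied to the resulting named vector.

```r
library(tibble)

# Toy data copied from the six rows printed in the question
d <- tibble(
  Entrezgene_ID = c("14", "80755", "60496", "6059", "10061", "10006"),
  log2fc        = c(-1.02, -1.45, -1.17, -1.48, -1.35, -1.51)
)

# First column becomes the names, second column becomes the values
geneList <- deframe(d)

# Sort by fold change, largest value first; the names travel with the values
geneList <- sort(geneList, decreasing = TRUE)

head(geneList)
```

The result is a plain named numeric vector rather than a tibble, so `sort()` works on it directly.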
so if I use
I should obtain already a vector with my values and their associated "names"?
Hi! Thank you! so the code should be:
because it gives me this error when I use [[ ]] :
and if I do :
I get the same result (and error) as before.
No, the code should be
`genes <- d[[2]]; names(genes) <- d[[1]]`
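To see why the single-bracket version from the question misbehaves, a small sketch on a three-row toy tibble may help (the values are taken from the question; the name `d` follows the thread). On a tibble, `d[, 2]` stays a one-column tibble, whereas `d[[2]]` returns the column itself.

```r
library(tibble)

# Three toy rows taken from the question's table
d <- tibble(
  Entrezgene_ID = c("14", "80755", "60496"),
  log2fc        = c(-1.02, -1.45, -1.17)
)

# Single-bracket indexing on a tibble keeps the tibble structure
class(d[, 2])    # still a (one-column) tibble, not a numeric vector

# Double-bracket indexing extracts the column as a plain vector
class(d[[2]])    # "numeric"

genes <- d[[2]]           # fold changes as a numeric vector
names(genes) <- d[[1]]    # Entrez IDs as the names
genes <- sort(genes, decreasing = TRUE)
genes
```

Note that a base `data.frame` would drop `d[, 2]` to a vector by default, which is probably why the original recipe works there but not on a tibble.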
On a tibble or data.frame, the `[[` function extracts a column as a vector: `[[`(my_df, column_index). It takes the data.frame and a column index as arguments; so when it's used as an operator it should look like