Question: topGO Error in .local(.Object, ...) : allGenes must be a factor with 2 levels
Ric (Australia) wrote, 18 months ago:

Hi, I get the error "Error in .local(.Object, ...) : allGenes must be a factor with 2 levels" when running the following code:

> library(edgeR)
> library(topGO)
...
> tr <- glmTreat(fit, contrast=B.LvsP, lfc=log2(1.5))
> topTags(tr)
Coefficient:  -1*Leaves.2 1*Leaves.3 
          logFC unshrunk.logFC logCPM  PValue     FDR
sp0090975   2.6            2.6    2.8 5.7e-14 2.7e-09
sp0037632  -3.0           -3.0    3.4 1.7e-13 2.7e-09
sp0074153  -3.9           -3.9    3.8 1.8e-13 2.7e-09
sp0008306   3.2            3.2    2.9 1.8e-13 2.7e-09
sp0073530  -4.5           -4.5    3.4 2.3e-12 2.3e-08
sp0025713  -3.9           -3.9    4.4 2.6e-12 2.3e-08
sp0037721   7.8            8.0    2.4 2.7e-12 2.3e-08
sp0083660   2.0            2.0    4.4 3.2e-12 2.3e-08
sp0052245  -2.9           -2.9    4.9 3.9e-12 2.6e-08
sp0071520  -3.3           -3.3    2.7 5.8e-12 3.4e-08


> geneID2GO <- readMappings(file = "~GOmapping.tsv")
> head(geneID2GO)
$`"V1"`
[1] "\"V14\""

$`"sp0000005"`
[1] "\"GO:0003723\""

$`"sp0000006"`
[1] "\"GO:0016021\""

$`"sp0000007"`
[1] "\"GO:0003700" "GO:0006355"   "GO:0043565\""

$`"sp0000016"`
[1] "\"GO:0046983\""

$`"sp0000017"`
[1] "\"GO:0004672" "GO:0005524"   "GO:0006468\""

> geneUniverse <- names(geneID2GO)
> head(geneUniverse)
[1] "\"V1\""        "\"sp0000005\"" "\"sp0000006\"" "\"sp0000007\"" "\"sp0000016\""
[6] "\"sp0000017\""

> genesOfInterest <- rownames(tr@.Data[[1]])
> head(genesOfInterest)
[1] "sp0025247" "sp0025250" "sp0025268" "sp0025270" "sp0025282" "sp0056834"

> geneList <- factor(as.integer(geneUniverse %in% genesOfInterest))

> head(geneList)
[1] 0 0 0 0 0 0
Levels: 0

> names(geneList) <- geneUniverse
> myGOdata <- new("topGOdata", 
+                 description="My project", 
+                 ontology="BP", 
+                 allGenes=geneList,  
+                 annot = annFUN.gene2GO, 
+                 gene2GO = geneID2GO)
Error in .local(.Object, ...) : allGenes must be a factor with 2 levels

What did I miss?

Thank you in advance.

Tags: edger, rna-seq, topgo, R

Are you sure that the line geneList <- factor(as.integer(geneUniverse %in% genesOfInterest)) is doing what you expect it to do? Converting first to integers and then to a factor doesn't look right to me. I'm also not sure that the lookup with %in% is going to work as you'd expect. The %in% operator returns a logical vector (TRUE or FALSE for each element of geneUniverse), not the matching values, so if no identifier matches exactly, every entry becomes 0.

Your geneList variable appears to be just zeros, i.e., a factor with a single level.
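
To illustrate the point about %in%, here is a minimal sketch on toy identifiers (the values are made up; only the embedded quote characters mirror the names(geneID2GO) output in the question). It shows that a universe whose IDs carry literal quotes never matches unquoted IDs, which leaves a factor with one level:

```r
# Toy data: the universe IDs contain literal double quotes,
# the IDs of interest do not (hypothetical values).
geneUniverse    <- c("\"sp0000005\"", "\"sp0000006\"", "\"sp0000007\"")
genesOfInterest <- c("sp0000006", "sp0000099")

# %in% returns one TRUE/FALSE per element of geneUniverse.
hits <- geneUniverse %in% genesOfInterest
print(hits)

# With no exact matches, the 0/1 indicator contains only zeros,
# so the resulting factor has a single level.
geneList <- factor(as.integer(hits))
print(nlevels(geneList))
```

Here every element of hits is FALSE and nlevels(geneList) is 1, which is the situation topGO rejects.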

Happy to assist further

(written 18 months ago by Kevin Blighe)

Ric (Australia) wrote, 18 months ago:
> genesOfInterest <- rownames(tr@.Data[[1]])
> head(genesOfInterest)
[1] "sp0025247" "sp0025250" "sp0025268" "sp0025270" "sp0025282" "sp0056834"
> geneList <- factor(as.integer(grep(genesOfInterest, geneUniverse)))
Warning message:
In grep(genesOfInterest, geneUniverse) :
  argument 'pattern' has length > 1 and only the first element will be used
> head(geneList)
factor(0)
Levels: 
> names(geneList) <- geneUniverse
Error in names(geneList) <- geneUniverse : 
  'names' attribute [44620] must be the same length as the vector [0]

Should your second option, geneUniverse <- gsub("\\\"", "", geneUniverse), be used together with grep?


Indeed, apologies: grep does not accept a vector of patterns. As the warning says, only the first element is used.

The grep code should be:

indices <- c()

for (i in 1:length(genesOfInterest)) indices <- c(indices, grep(genesOfInterest[i], geneUniverse))

geneList <- factor(as.integer(indices))

The second option should work without grep.

(written 18 months ago by Kevin Blighe)

I now get the error: 'names' attribute [44620] must be the same length as the vector [33975]

> genesOfInterest <- rownames(tr@.Data[[1]])
> head(genesOfInterest)
[1] "sp0025247" "sp0025250" "sp0025268" "sp0025270" "sp0025282" "sp0056834"
> indices <- c()
> for (i in 1:length(genesOfInterest)) indices <- c(indices, grep(genesOfInterest[i], geneUniverse))
> geneList <- factor(as.integer(indices))
> head(geneList)
[1] 12064 12076 26933 18988 18992 18994
33975 Levels: 2 3 4 5 6 8 10 11 12 13 14 15 18 19 22 23 24 25 26 28 29 30 31 32 33 34 35 ... 44618
> names(geneList) <- geneUniverse
Error in names(geneList) <- geneUniverse : 
  'names' attribute [44620] must be the same length as the vector [33975]
(written 18 months ago by Ric)

I think that your command names(geneList) <- geneUniverse should be

names(geneList) <- geneUniverse[indices]
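
Note that a factor built directly from the indices still has one level per matched position, not two. A minimal sketch of turning such indices into the 0/1 indicator over the whole universe that the error message asks for, assuming toy gene names and hit positions:

```r
# Toy universe and hit positions (hypothetical; in the thread the
# positions come from the grep loop).
geneUniverse <- c("g1", "g2", "g3", "g4", "g5")
indices      <- c(2L, 4L)

# Start with 0 for every universe gene, then flag the hits with 1.
membership <- integer(length(geneUniverse))
membership[indices] <- 1L

# Declaring both levels keeps the factor at 2 levels even when
# there happen to be no hits.
geneList <- factor(membership, levels = c(0, 1))
names(geneList) <- geneUniverse
print(geneList)
```

The result has the same length as geneUniverse, so the names assignment no longer fails.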

Nevertheless, in your other thread, which appears to be a duplicate of this, didn't you say that the geneList vector needs to have P values and then names as the gene names?
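
For reference, a minimal sketch of that P-value form, using base R only. The P values and the 0.05 cut-off are assumptions for illustration. In topGO, a named numeric vector like this is passed as allGenes together with a selection function given through the geneSel argument:

```r
# Hypothetical P values, named by gene ID.
pvals <- c(sp0000005 = 0.0004, sp0000006 = 0.31,
           sp0000007 = 0.02,   sp0000016 = 0.76)

# Selection function: TRUE for genes to treat as significant.
# The 0.05 threshold is an assumption, not taken from the thread.
topDiffGenes <- function(scores) scores < 0.05

# What the selection function would pick from this toy vector.
selected <- topDiffGenes(pvals)
print(names(pvals)[selected])
```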

(written 18 months ago by Kevin Blighe)

Thank you for your help. Yes, I found another tutorial here which took a slightly different approach.

(written 18 months ago by Ric)

Ric (Australia) wrote, 18 months ago:

I have followed this tutorial for topGO. The only thing I changed was:

> genesOfInterest <- read.table("interestinggenes.txt",header=FALSE)
> genesOfInterest <- as.character(genesOfInterest$V1)

to

> genesOfInterest <- rownames(tr@.Data[[1]])

Do you have any idea how to fix the geneList?
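
One thing worth checking: rownames(tr@.Data[[1]]) likely returns every gene in the fitted object rather than only the significant ones. A minimal sketch of picking genes of interest by FDR instead, assuming a toy data frame that stands in for the table from topTags(tr, n = Inf)$table (the non-significant rows and the 0.05 cut-off are made up):

```r
# Toy stand-in for the full topTags table (hypothetical values).
tab <- data.frame(
  logFC  = c(2.6, -3.0, 0.2, -0.1),
  PValue = c(5.7e-14, 1.7e-13, 0.40, 0.85),
  FDR    = c(2.7e-09, 2.7e-09, 0.62, 0.90),
  row.names = c("sp0090975", "sp0037632", "sp0000005", "sp0000006")
)

# Keep only genes below the chosen FDR threshold (assumed 0.05).
fdr_cutoff      <- 0.05
genesOfInterest <- rownames(tab)[tab$FDR < fdr_cutoff]

# All tested genes, for comparison with the selected subset.
allTested <- rownames(tab)
print(genesOfInterest)
```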

Thank you in advance.


I see, so that line just tests membership: it flags which entries of geneUniverse occur in genesOfInterest rather than returning the values themselves.

It looks like your matching is simply not working.

For example, "\"sp0000005\"" is not the same as "sp0000005":

a <- c("\"sp0000005\"")
b <- c("sp0000005")

a %in% b
[1] FALSE

b %in% a
[1] FALSE

...however, grep returns indices and also accepts partial matches (which can be risky at times):

grep(b, a)
[1] 1
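
A minimal sketch of why partial matching is risky, using hypothetical variable-length IDs (not the IDs from the thread). grep treats its pattern as a regular expression and matches substrings, whereas match() compares whole strings:

```r
# Hypothetical IDs where one is a prefix of the others.
ids <- c("gene1", "gene10", "gene100")

# Substring matching: "gene1" is found inside all three IDs.
partial <- grep("gene1", ids)
print(partial)

# Exact matching: only the identical string counts.
exact <- match("gene1", ids)
print(exact)
```

Here partial is 1 2 3 while exact is 1, so a grep-based lookup can silently pull in unrelated genes.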
(written 18 months ago by Kevin Blighe)

I found another tutorial here but I do not know how to use it only in R

(written 18 months ago by Ric)

Dear Ric,

I think that you just need to change the following line

geneList <- factor(as.integer(geneUniverse %in% genesOfInterest))

to:

EDIT: see next answer thread below

The other option is to add a new line of code after geneUniverse <- names(geneID2GO) that strips the embedded quote characters, so that the identifiers in geneUniverse and genesOfInterest have the same format:

geneUniverse <- gsub("\\\"", "", geneUniverse)
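
Cleaning geneUniverse alone may not be enough, because the names and GO terms inside geneID2GO carry the same quotes, and the file header appears to have been read as a gene ("V1"). A minimal sketch of cleaning the whole mapping, assuming a toy list shaped like the head(geneID2GO) output in the question (strip_quotes and the gene of interest are hypothetical):

```r
# Toy stand-in for the list returned by readMappings().
geneID2GO <- setNames(
  list("\"V14\"",
       "\"GO:0003723\"",
       c("\"GO:0003700", "GO:0006355", "GO:0043565\"")),
  c("\"V1\"", "\"sp0000005\"", "\"sp0000007\"")
)

# Hypothetical helper: remove literal double quotes.
strip_quotes <- function(x) gsub("\"", "", x, fixed = TRUE)

# Clean the gene IDs and every GO term, then drop the header entry.
names(geneID2GO) <- strip_quotes(names(geneID2GO))
geneID2GO <- lapply(geneID2GO, strip_quotes)
geneID2GO <- geneID2GO[names(geneID2GO) != "V1"]

geneUniverse    <- names(geneID2GO)
genesOfInterest <- c("sp0000007")  # hypothetical gene of interest

# 0/1 indicator over the universe, with both levels declared.
geneList <- factor(as.integer(geneUniverse %in% genesOfInterest),
                   levels = c(0, 1))
names(geneList) <- geneUniverse
print(geneList)
```

Writing the mapping file without quotes and without a header row in the first place would avoid the cleanup step.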
(written 18 months ago by Kevin Blighe)