Question: topGO: Error in .local(.Object, ...) : allGenes must be a factor with 2 levels
Ric (Australia) wrote, 2.2 years ago:

Hi, I get the error "Error in .local(.Object, ...) : allGenes must be a factor with 2 levels" when running the following code:

> library(edgeR)
> library(topGO)
...
> tr <- glmTreat(fit, contrast=B.LvsP, lfc=log2(1.5))
> topTags(tr)
Coefficient:  -1*Leaves.2 1*Leaves.3 
          logFC unshrunk.logFC logCPM  PValue     FDR
sp0090975   2.6            2.6    2.8 5.7e-14 2.7e-09
sp0037632  -3.0           -3.0    3.4 1.7e-13 2.7e-09
sp0074153  -3.9           -3.9    3.8 1.8e-13 2.7e-09
sp0008306   3.2            3.2    2.9 1.8e-13 2.7e-09
sp0073530  -4.5           -4.5    3.4 2.3e-12 2.3e-08
sp0025713  -3.9           -3.9    4.4 2.6e-12 2.3e-08
sp0037721   7.8            8.0    2.4 2.7e-12 2.3e-08
sp0083660   2.0            2.0    4.4 3.2e-12 2.3e-08
sp0052245  -2.9           -2.9    4.9 3.9e-12 2.6e-08
sp0071520  -3.3           -3.3    2.7 5.8e-12 3.4e-08


> geneID2GO <- readMappings(file = "~GOmapping.tsv")
> head(geneID2GO)
$`"V1"`
[1] "\"V14\""

$`"sp0000005"`
[1] "\"GO:0003723\""

$`"sp0000006"`
[1] "\"GO:0016021\""

$`"sp0000007"`
[1] "\"GO:0003700" "GO:0006355"   "GO:0043565\""

$`"sp0000016"`
[1] "\"GO:0046983\""

$`"sp0000017"`
[1] "\"GO:0004672" "GO:0005524"   "GO:0006468\""

> geneUniverse <- names(geneID2GO)
> head(geneUniverse)
[1] "\"V1\""        "\"sp0000005\"" "\"sp0000006\"" "\"sp0000007\"" "\"sp0000016\""
[6] "\"sp0000017\""

> genesOfInterest <- rownames(tr@.Data[[1]])
> head(genesOfInterest)
[1] "sp0025247" "sp0025250" "sp0025268" "sp0025270" "sp0025282" "sp0056834"

> geneList <- factor(as.integer(geneUniverse %in% genesOfInterest))

> head(geneList)
[1] 0 0 0 0 0 0
Levels: 0

> names(geneList) <- geneUniverse
> myGOdata <- new("topGOdata", 
+                 description="My project", 
+                 ontology="BP", 
+                 allGenes=geneList,  
+                 annot = annFUN.gene2GO, 
+                 gene2GO = geneID2GO)
Error in .local(.Object, ...) : allGenes must be a factor with 2 levels

What did I miss?

Thank you in advance.

Tags: edger, rna-seq, topgo, R
written 2.2 years ago by Ric

Are you sure that the line geneList <- factor(as.integer(geneUniverse %in% genesOfInterest)) is doing what you expect it to do? The %in% operator returns a logical vector (TRUE or FALSE for each element of geneUniverse), not the matching values, so the result is a factor of 0s and 1s. I'm not sure that the lookup is matching anything, though.

Your geneList variable appears to be all zeros, i.e., a factor with a single level.

Happy to assist further
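
A quick way to check this is sketched below. It is only an illustration: universe and interesting are made-up stand-ins for geneUniverse and genesOfInterest, and the IDs are hypothetical. The factor only gets two levels if at least one ID matches and at least one does not.

```r
# Toy stand-ins for geneUniverse and genesOfInterest (hypothetical IDs).
universe    <- c("\"sp0000005\"", "\"sp0000006\"", "\"sp0000007\"")
interesting <- c("sp0000006")

# %in% gives a logical vector, one value per element of `universe`.
hits <- universe %in% interesting
print(hits)

# Nothing matches because of the embedded quotes, so only one level exists.
geneList <- factor(as.integer(hits))
print(nlevels(geneList))

# Tabulating the hits before calling topGO makes the problem visible early.
print(table(hits))
```

If table(hits) shows only FALSE (or only TRUE), topGO will reject the factor, whatever the later steps look like.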

written 2.2 years ago by Kevin Blighe
Ric wrote, 2.2 years ago:
> genesOfInterest <- rownames(tr@.Data[[1]])
> head(genesOfInterest)
[1] "sp0025247" "sp0025250" "sp0025268" "sp0025270" "sp0025282" "sp0056834"
> geneList <- factor(as.integer(grep(genesOfInterest, geneUniverse)))
Warning message:
In grep(genesOfInterest, geneUniverse) :
  argument 'pattern' has length > 1 and only the first element will be used
> head(geneList)
factor(0)
Levels: 
> names(geneList) <- geneUniverse
Error in names(geneList) <- geneUniverse : 
  'names' attribute [44620] must be the same length as the vector [0]

Should your second option, geneUniverse <- gsub("\\\"", "", geneUniverse), be used together with grep?

written 2.2 years ago by Ric

Indeed, apologies, grep only accepts a single pattern, not a vector of values.

The grep code should be:

indices <- c()

for (i in 1:length(genesOfInterest)) indices <- c(indices, grep(genesOfInterest[i], geneUniverse))

geneList <- factor(as.integer(indices))

The second option should work without grep.

written 2.2 years ago by Kevin Blighe

I now get the error 'names' attribute [44620] must be the same length as the vector [33975]:

> genesOfInterest <- rownames(tr@.Data[[1]])
> head(genesOfInterest)
[1] "sp0025247" "sp0025250" "sp0025268" "sp0025270" "sp0025282" "sp0056834"
> indices <- c()
> for (i in 1:length(genesOfInterest)) indices <- c(indices, grep(genesOfInterest[i], geneUniverse))
> geneList <- factor(as.integer(indices))
> head(geneList)
[1] 12064 12076 26933 18988 18992 18994
33975 Levels: 2 3 4 5 6 8 10 11 12 13 14 15 18 19 22 23 24 25 26 28 29 30 31 32 33 34 35 ... 44618
> names(geneList) <- geneUniverse
Error in names(geneList) <- geneUniverse : 
  'names' attribute [44620] must be the same length as the vector [33975]
written 2.2 years ago by Ric

I think that your command names(geneList) <- geneUniverse should be:

names(geneList) <- geneUniverse[indices]

Nevertheless, in your other thread, which appears to be a duplicate of this, didn't you say that the geneList vector needs to have P values and then names as the gene names?
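
The length mismatch arises because indices holds one entry per match, not one entry per gene in the universe, and turning the positions themselves into a factor gives thousands of levels instead of two. A minimal sketch of converting match positions into a 0/1 factor over the whole universe follows; the IDs and positions are made up for illustration.

```r
# Hypothetical universe and match positions (as a grep loop might return).
geneUniverse <- c("sp0000005", "sp0000006", "sp0000007", "sp0000016")
indices      <- c(2, 4)

# One entry per universe gene: 1 if its position was matched, 0 otherwise.
membership <- as.integer(seq_along(geneUniverse) %in% indices)

# The factor now has the same length as the universe, so naming it works.
geneList <- factor(membership)
names(geneList) <- geneUniverse

print(geneList)
```

The levels are deliberately not forced to c(0, 1) here, so a universe with no matches (or only matches) still shows up as a one-level factor rather than being hidden.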

written 2.2 years ago by Kevin Blighe

Thank you for your help. Yes, I found another tutorial here which took a slightly different approach.

written 2.2 years ago by Ric
Ric wrote, 2.2 years ago:

I have followed this tutorial for topGO. The only thing I changed was:

> genesOfInterest <- read.table("interestinggenes.txt",header=FALSE)
> genesOfInterest <- as.character(genesOfInterest$V1)

to

> genesOfInterest <- rownames(tr@.Data[[1]])

Do you have any idea how to fix the geneList?

Thank you in advance.

written 2.2 years ago by Ric

I see, so that line just records which genes in geneUniverse are found in genesOfInterest (and not their values).

It just looks like your matching is not working.

For example, "\"sp0000005\"" is not the same as "sp0000005"

a <- c("\"sp0000005\"")
b <- c("sp0000005")

a %in% b
[1] FALSE

b %in% a
[1] FALSE

...however, grep returns indices and also accepts partial matches (which can be risky at times):

grep(b, a)
[1] 1
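
To illustrate the risk with partial matches, here is a small sketch with made-up IDs in which one ID is a prefix of another. The names universe, query, partial, clean and exact are only for illustration.

```r
# Hypothetical IDs: "sp000001" is a prefix of "sp0000010".
universe <- c("\"sp000001\"", "\"sp0000010\"", "\"sp0000020\"")
query    <- "sp000001"

# grep matches substrings, so the longer ID is picked up as well.
partial <- grep(query, universe, fixed = TRUE)

# Stripping the embedded quotes allows an exact comparison instead.
clean <- gsub("\"", "", universe, fixed = TRUE)
exact <- which(clean %in% query)

print(partial)
print(exact)
```

Here partial contains two positions while exact contains only the intended one, which is why cleaning the IDs and using %in% is usually the safer route.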
written 2.2 years ago by Kevin Blighe

I found another tutorial here, but I do not know how to use it in R alone.

written 2.2 years ago by Ric

Dear Ric,

I think that you just need to change the following line

geneList <- factor(as.integer(geneUniverse %in% genesOfInterest))

to:

EDIT: see next answer thread below

The other option is to add a new line of code after geneUniverse <- names(geneID2GO) that strips the embedded quotes, so that the IDs in geneUniverse and genesOfInterest have the same form:

geneUniverse <- gsub("\\\"", "", geneUniverse)
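
Note that the head(geneID2GO) output in the question shows quotes inside the GO terms too, plus a "V1" entry that looks like a header row read as a gene. Since the names of geneList need to match the names of geneID2GO, it may be simpler to clean the mapping list itself. The sketch below does this on a toy list that mimics that output; strip_quotes is a hypothetical helper and the gene of interest is made up.

```r
# Toy list mimicking the head(geneID2GO) output above (hypothetical subset).
geneID2GO <- setNames(
  list("\"V14\"",
       "\"GO:0003723\"",
       c("\"GO:0003700", "GO:0006355", "GO:0043565\"")),
  c("\"V1\"", "\"sp0000005\"", "\"sp0000007\"")
)

# Remove the literal double quotes from the gene IDs and the GO terms.
strip_quotes <- function(x) gsub("\"", "", x, fixed = TRUE)
names(geneID2GO) <- strip_quotes(names(geneID2GO))
geneID2GO <- lapply(geneID2GO, strip_quotes)

# Drop the header row that was read as if it were a gene (V1 -> V14).
geneID2GO <- geneID2GO[names(geneID2GO) != "V1"]

# Rebuild the universe and the 0/1 factor from the cleaned names.
geneUniverse    <- names(geneID2GO)
genesOfInterest <- c("sp0000007")   # hypothetical gene of interest
geneList <- factor(as.integer(geneUniverse %in% genesOfInterest))
names(geneList) <- geneUniverse
print(geneList)
```

One more thing worth checking, as an assumption based on the 33975 matches reported above: rownames(tr@.Data[[1]]) may list every gene in the fit rather than only the significant ones. If so, genesOfInterest would need to be restricted to the differentially expressed genes first, otherwise nearly every annotated gene ends up in level 1.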
written 2.2 years ago by Kevin Blighe