Question: TopGO Error in .local(.Object, ...) : allGenes must be a factor with 2 levels
9 days ago by jomagrax (Spain):

Hi, I'm Jose! This is my first time using topGO and I'm having problems generating the GOdata object in R. Thank you all in advance. This is the code I'm using:

 # 1. Data preparation: the list of gene identifiers, gene scores, list of differentially expressed genes, and gene-to-GO annotations are all collected and stored in a single R object.

 > annot_GO <- read_delim("E:/VESCA/gen_GO.txt", "\t", escape_double = FALSE, col_names = FALSE, trim_ws = TRUE)
Parsed with column specification:
cols(
  X1 = col_character(),
  X2 = col_character()
)

> annot_GO
# A tibble: 32,832 x 2
   X1                    X2                              
   <chr>                 <chr>                           
 1 locusName             GO                              
 2 gene00090-v1.0-hybrid NA                              
 3 gene00091-v1.0-hybrid GO:0003677,GO:0046983                                       
# ... with 32,822 more rows



 > # create a list of GO terms

    > geneID2GO <- as.list(as.character(annot_GO$X1)) # generates list; element names are transcript IDs
    > geneID2GO <- as.list(setNames(as.character(annot_GO$X2), as.character(annot_GO$X1))) # adds Gene Ontology data to list
    > geneID2GO <- lapply(geneID2GO, function(x) unlist(strsplit(x, split="[,]"))) # split single GO terms string into a character vector, one element per term
    > str(head(geneID2GO))
    List of 6
     $ locusName            : chr "GO"
     $ gene00090-v1.0-hybrid: chr NA
     $ gene00091-v1.0-hybrid: chr [1:2] "GO:0003677" "GO:0046983"



> # make full list of transcript names, geneNames

> geneNames <- names(geneID2GO)
> head(geneNames)
[1] "locusName"             "gene00090-v1.0-hybrid" "gene00091-v1.0-hybrid" "gene00092-v1.0-hybrid" "gene00093-v1.0-hybrid"
[6] "gene00094-v1.0-hybrid"


> head(MyInterestingGenes1)
    [1] "77981546__"           "CL11544Contig1__"     "CL3CG7R__"            "CL8558Contig1__"      "contig00421__258___5"
    [6] "contig01716__233___6"


> #List of all genes
> geneList_1 <- factor(as.integer(geneNames %in% MyInterestingGenes1)) 
> str(geneList_1) 
 Factor w/ 1 level "0": 1 1 1 1 1 1 1 1 1 1 ...
> head(geneList_1)
            locusName gene00090-v1.0-hybrid gene00091-v1.0-hybrid gene00092-v1.0-hybrid gene00093-v1.0-hybrid 
                    0                     0                     0                     0                     0 
gene00094-v1.0-hybrid 
                    0 
Levels: 0


> #Creation of "GOdata object"
> GOdata_1 <- new("topGOdata", ontology = "MF", allGenes = geneList_1, annot = annFUN.gene2GO, nodeSize=5, gene2GO = geneID2GO)
Error in .local(.Object, ...) : allGenes must be a factor with 2 levels

"MyInterestingGenes1" comes from a DESeq2 analysis run after a kallisto mapping.

As far as I understand, the problem is that none of the genes in "MyInterestingGenes1" match the ones in "geneNames", which is why the factor "geneList_1" has only one level ("0") instead of two.

Could you help me figure this out?

rna-seq topgo R • 56 views

Note that topGO expects what you called geneNames to be a large set of genes that includes those in MyInterestingGenes1. In your case the two objects look completely different, which is why topGO fails. You could look at this thread (which was about another issue) and try to reproduce it to get acquainted with how topGO operates.
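A quick way to confirm the mismatch before calling topGO is to check how many identifiers the two vectors share. The sketch below uses only base R and a few toy values copied from the output in the question; `MyInterestingGenes_ok` is a hypothetical vector, added only to show what the factor looks like once both sets use the same identifiers. It also drops the "locusName" entry, which appears to be the file's header row read in as data because `col_names = FALSE` was used.

```r
# Toy stand-ins for the objects in the question (a few values only).
geneNames <- c("locusName", "gene00090-v1.0-hybrid",
               "gene00091-v1.0-hybrid", "gene00092-v1.0-hybrid")
MyInterestingGenes1 <- c("77981546__", "CL11544Contig1__",
                         "contig00421__258___5")

# "locusName" looks like the header row read in as a data row; remove it.
geneNames <- setdiff(geneNames, "locusName")

# How many interesting genes are present in the gene universe?
shared <- intersect(geneNames, MyInterestingGenes1)
length(shared)  # 0: the two sets use different identifiers

# With no overlap, every gene is coded 0 and the factor has a single level.
geneList_bad <- factor(as.integer(geneNames %in% MyInterestingGenes1))
names(geneList_bad) <- geneNames
nlevels(geneList_bad)  # 1

# Hypothetical case: interesting genes named with the annotation's identifiers.
MyInterestingGenes_ok <- c("gene00091-v1.0-hybrid")
geneList_ok <- factor(as.integer(geneNames %in% MyInterestingGenes_ok))
names(geneList_ok) <- geneNames
nlevels(geneList_ok)  # 2
```

If `length(shared)` is 0 on the real data, the contig-style names from the kallisto/DESeq2 step would first need to be mapped to the gene identifiers used in the GO annotation file (or the annotation rebuilt on the contig names) before the two-level factor can be created.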

written 9 days ago by Fabio Marroni