ViSEAGO topGOdata GO analysis error in R. Please help!
3.6 years ago
jpaveley1 ▴ 10

I am trying to perform GO analysis in R on RNA-seq data I have generated. My DEG list of interest is called GRL4v3VisData, and my background list contains the Ensembl IDs of all genes that have at least 1 transcript. This should give more statistical power than using all genes in the zebrafish genome as the background.

Install ViSEAGO using:

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

BiocManager::install("ViSEAGO")

Load the data tables from the working directory:

GRL4v3VisData<-data.table::fread("GRL4v3ResultsViseaGO.txt", select = c("ensembl","padj"))

background <- data.table::fread("GRBgViseago.txt", select = "ensembl")

Input Data: GRL4v3VisData is 2836 observations of 2 variables

> head(GRL4v3VisData)
          ensembl      padj
1: ENSDARG00000028396 1.76e-115
2: ENSDARG00000023587  2.46e-58
3: ENSDARG00000075666  5.46e-47
4: ENSDARG00000043154  3.98e-44
5: ENSDARG00000039682  9.37e-42
6: ENSDARG00000022631  3.62e-40

background is 18726 observations of 1 variable

> head(background)
          ensembl
1: ENSDARG00000113107
2: ENSDARG00000084828
3: ENSDARG00000093924
4: ENSDARG00000102104
5: ENSDARG00000113105
6: ENSDARG00000103050

Create object of all GO annotations from Ensembl

Ensembl <- ViSEAGO::Ensembl2GO(biomart = "genes", host = "www.ensembl.org", version = NULL)

Annotate Zebrafish genome with GO annotations from Ensembl

myGENE2GO<-ViSEAGO::annotate("drerio_gene_ensembl", Ensembl)

Create the topGOdata object for Biological Process, with these inputs: gene selection, gene background, GO category (MF, BP, or CC), and the minimum number of annotated genes per GO term (nodeSize).

BP<-ViSEAGO::create_topGOdata(geneSel=GRL4v3VisData, allGenes=background, gene2GO=myGENE2GO, ont="BP", nodeSize=5)

And the result is:

Error in .local(.Object, ...) : allGenes must be a factor with 2 levels

I am wondering if this is because it is comparing the row names of both tables, which are integers, rather than the actual Ensembl IDs? I understand the factor levels should be 0 or 1 (FALSE or TRUE), but I am unsure how to resolve this issue.

Please help!
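
For reference, here is a minimal base-R sketch of the kind of named two-level factor the error message refers to. It assumes the gene selection and background are supplied as plain character vectors of IDs rather than as data.tables. The toy tables and `GENE_*` IDs are invented for illustration, and the names `selection_tbl`, `background_tbl`, and `gene_list` are hypothetical; this is not taken from the ViSEAGO source.

```r
# Toy stand-ins for the two tables in the post (IDs are made up for illustration).
selection_tbl <- data.frame(
  ensembl = c("GENE_A", "GENE_C"),
  padj = c(1e-10, 1e-5),
  stringsAsFactors = FALSE
)
background_tbl <- data.frame(
  ensembl = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  stringsAsFactors = FALSE
)

# Pull the ID columns out as plain character vectors.
selection <- unique(selection_tbl$ensembl)
background <- unique(background_tbl$ensembl)

# Every selected gene should also be present in the background.
stopifnot(all(selection %in% background))

# Named factor over the background: "1" = selected gene, "0" = background only.
gene_list <- factor(as.integer(background %in% selection))
names(gene_list) <- background

# Count how many genes fall in each of the two levels.
print(table(gene_list))
```

If the same assumption holds for the real data, passing `GRL4v3VisData$ensembl` as `geneSel` and `background$ensembl` as `allGenes` may be worth trying. This is untested here.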
