Question: edgeR and GO analysis using goana
0
gravatar for moxu
3.7 years ago by
moxu460
moxu460 wrote:

I used RSEM to align and quantify RNAseq, then used edgeR to get differentially expressed genes with someting like the following:

library("edgeR");

d0<-read.delim("mydata.txt", row.names="gene_id");
d<-d0[c("c1","c2","t1","t2")];
attach(d);
group<-factor(c(1,1,2,2));
dgeList <- DGEList(counts=d, group=group);
normFac <- calcNormFactors(dgeList); # ERCC normalization
dsgn <- model.matrix(~group);
disp <- estimateDisp(normFac, dsgn);
qlFit <- glmQLFit(disp, dsgn);
qlfTest <- glmQLFTest(qlFit, coef=2);
topTags(qlfTest);

# do GO analysis
go <- goana(qlfTest, specifes="Hs");

The code works up to "topTags(qlfTest)", but stopped at the last line with the following error:

Error in goana.default(qlfTest, specifes = "Hs", de = list(Up = c("A1BG",  : 
  No genes found in universe

Any ideas?

Thanks!

rna-seq next-gen gene • 6.6k views
ADD COMMENTlink modified 3.7 years ago by benformatics1.6k • written 3.7 years ago by moxu460
1

show us a snippet of mydata.txt

ADD REPLYlink written 3.7 years ago by Jeremy Leipzig19k
         c1 c2 t1 t2
A1BG        33.61    35.00    100.26    129.68
A1BG-AS1    33.25    35.13     93.40    112.64
A1CF         1.00     2.06      3.55      4.30
A2M       3843.77  3995.67      4.03      5.03
A2M-AS1     11.23    17.34     65.97     88.99
ADD REPLYlink modified 3.7 years ago • written 3.7 years ago by moxu460

Also it should be species = "Hs" not specifes.

ADD REPLYlink written 3.7 years ago by benformatics1.6k

Thanks for the correction. However, with "species", I got the same error.

ADD REPLYlink written 3.7 years ago by moxu460

And what's the output of topTags(qlfTest)?

ADD REPLYlink written 3.7 years ago by WouterDeCoster43k
> topTags(qlfTest)
Coefficient:  group2 
             logFC    logCPM        F       PValue          FDR
ALDH1A3   7.268353  9.470591 18973.47 1.274423e-28 1.542331e-24
KRT7      7.941622  9.301393 18935.93 1.297276e-28 1.542331e-24
BCAT1    -4.480013 10.361213 17991.53 2.053146e-28 1.627324e-24
FLG     -12.436994  8.134986 15132.03 9.703443e-28 5.768212e-24
SLPI     -4.676582  8.770861 14112.19 1.814653e-27 8.629763e-24
ADD REPLYlink written 3.7 years ago by moxu460
1
gravatar for WouterDeCoster
3.7 years ago by
Belgium
WouterDeCoster43k wrote:

You need to convert your gene identifiers to Entrez gene identifiers. E.g KRT7 is 3855.

I suggest BioMart from Ensembl or https://github.com/stephenturner/annotables

ADD COMMENTlink written 3.7 years ago by WouterDeCoster43k

Great!

I've installed all the packages in the link, then for my edgeR variables, what exactly should I do to be able to run "goana"?

Thanks!

ADD REPLYlink modified 3.7 years ago • written 3.7 years ago by moxu460

You need to perform a join using the required table (e.g. grch38) and your dataframe of differential expressed genes (qlfTest). The dplyr library is great for jobs like this. A few examples are also included on the github page and http://www.gettinggeneticsdone.com/2015/11/annotables-convert-gene-ids.html.

ADD REPLYlink written 3.7 years ago by WouterDeCoster43k
1
gravatar for benformatics
3.7 years ago by
benformatics1.6k
ETH Zurich
benformatics1.6k wrote:

I believe your problem lies in that you are using the wrong objects for the goana function.

You cannot just feed it the qflTest and expected it to work. You need to give it a list of genes.

degs <- topTags(qlfTest)[[1]]
go <- goana(degs, species="Hs");

That should work. Ideally in this case you want to feed it:

de = Your differentially expressed genes (up and/or down)

universe = All expressed genes (all row.names of counts/dge object) or alternatively All annotated genes

ADD COMMENTlink modified 3.7 years ago • written 3.7 years ago by benformatics1.6k

According to the help information of goana:

de: a vector of Entrez Gene IDs, or a list of such vectors, or an "MArrayLM" fit object.

ADD REPLYlink written 3.7 years ago by WouterDeCoster43k

Yes, you are right.

The problem now is that "topGO(go)" invariantly (I have other differential gene expression analyses) returns the same list of GO pathways as the following:

> go <- goana(as.vector(degs), species="Hs");
> topGO(go)
                                                                                                 Term
GO:0000002                                                           mitochondrial genome maintenance
GO:0000003                                                                               reproduction
GO:0000009                                                     alpha-1,6-mannosyltransferase activity
GO:0000010                                                  trans-hexaprenyltranstransferase activity
GO:0000012                                                                 single strand break repair
GO:0000014                                         single-stranded DNA endodeoxyribonuclease activity
GO:0000015                                                          phosphopyruvate hydratase complex
GO:0000016                                                                           lactase activity
GO:0000018                                                            regulation of DNA recombination

Certainly something is wrong.

ADD REPLYlink written 3.7 years ago by moxu460
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1417 users visited in the last hour