Question: edgeR and GO analysis using goana
0
gravatar for moushengxu
12 months ago by
moushengxu260
moushengxu260 wrote:

I used RSEM to align and quantify RNAseq, then used edgeR to get differentially expressed genes with someting like the following:

library("edgeR");

d0<-read.delim("mydata.txt", row.names="gene_id");
d<-d0[c("c1","c2","t1","t2")];
attach(d);
group<-factor(c(1,1,2,2));
dgeList <- DGEList(counts=d, group=group);
normFac <- calcNormFactors(dgeList); # ERCC normalization
dsgn <- model.matrix(~group);
disp <- estimateDisp(normFac, dsgn);
qlFit <- glmQLFit(disp, dsgn);
qlfTest <- glmQLFTest(qlFit, coef=2);
topTags(qlfTest);

# do GO analysis
go <- goana(qlfTest, specifes="Hs");

The code works up to "topTags(qlfTest)", but stopped at the last line with the following error:

Error in goana.default(qlfTest, specifes = "Hs", de = list(Up = c("A1BG",  : 
  No genes found in universe

Any ideas?

Thanks!

rna-seq next-gen gene • 1.7k views
ADD COMMENTlink modified 12 months ago by benformatics150 • written 12 months ago by moushengxu260
1

show us a snippet of mydata.txt

ADD REPLYlink written 12 months ago by Jeremy Leipzig17k
         c1 c2 t1 t2
A1BG        33.61    35.00    100.26    129.68
A1BG-AS1    33.25    35.13     93.40    112.64
A1CF         1.00     2.06      3.55      4.30
A2M       3843.77  3995.67      4.03      5.03
A2M-AS1     11.23    17.34     65.97     88.99
ADD REPLYlink modified 12 months ago • written 12 months ago by moushengxu260

Also it should be species = "Hs" not specifes.

ADD REPLYlink written 12 months ago by benformatics150

Thanks for the correction. However, with "species", I got the same error.

ADD REPLYlink written 12 months ago by moushengxu260

And what's the output of topTags(qlfTest)?

ADD REPLYlink written 12 months ago by WouterDeCoster21k
> topTags(qlfTest)
Coefficient:  group2 
             logFC    logCPM        F       PValue          FDR
ALDH1A3   7.268353  9.470591 18973.47 1.274423e-28 1.542331e-24
KRT7      7.941622  9.301393 18935.93 1.297276e-28 1.542331e-24
BCAT1    -4.480013 10.361213 17991.53 2.053146e-28 1.627324e-24
FLG     -12.436994  8.134986 15132.03 9.703443e-28 5.768212e-24
SLPI     -4.676582  8.770861 14112.19 1.814653e-27 8.629763e-24
ADD REPLYlink written 12 months ago by moushengxu260
1
gravatar for WouterDeCoster
12 months ago by
Belgium
WouterDeCoster21k wrote:

You need to convert your gene identifiers to Entrez gene identifiers. E.g KRT7 is 3855.

I suggest BioMart from Ensembl or https://github.com/stephenturner/annotables

ADD COMMENTlink written 12 months ago by WouterDeCoster21k

Great!

I've installed all the packages in the link, then for my edgeR variables, what exactly should I do to be able to run "goana"?

Thanks!

ADD REPLYlink modified 12 months ago • written 12 months ago by moushengxu260

You need to perform a join using the required table (e.g. grch38) and your dataframe of differential expressed genes (qlfTest). The dplyr library is great for jobs like this. A few examples are also included on the github page and http://www.gettinggeneticsdone.com/2015/11/annotables-convert-gene-ids.html.

ADD REPLYlink written 12 months ago by WouterDeCoster21k
0
gravatar for benformatics
12 months ago by
benformatics150
benformatics150 wrote:

I believe your problem lies in that you are using the wrong objects for the goana function.

You cannot just feed it the qflTest and expected it to work. You need to give it a list of genes.

degs <- topTags(qlfTest)[[1]]
go <- goana(degs, species="Hs");

That should work. Ideally in this case you want to feed it:

de = Your differentially expressed genes (up and/or down)

universe = All expressed genes (all row.names of counts/dge object) or alternatively All annotated genes

ADD COMMENTlink modified 12 months ago • written 12 months ago by benformatics150

According to the help information of goana:

de: a vector of Entrez Gene IDs, or a list of such vectors, or an "MArrayLM" fit object.

ADD REPLYlink written 12 months ago by WouterDeCoster21k

Yes, you are right.

The problem now is that "topGO(go)" invariantly (I have other differential gene expression analyses) returns the same list of GO pathways as the following:

> go <- goana(as.vector(degs), species="Hs");
> topGO(go)
                                                                                                 Term
GO:0000002                                                           mitochondrial genome maintenance
GO:0000003                                                                               reproduction
GO:0000009                                                     alpha-1,6-mannosyltransferase activity
GO:0000010                                                  trans-hexaprenyltranstransferase activity
GO:0000012                                                                 single strand break repair
GO:0000014                                         single-stranded DNA endodeoxyribonuclease activity
GO:0000015                                                          phosphopyruvate hydratase complex
GO:0000016                                                                           lactase activity
GO:0000018                                                            regulation of DNA recombination

Certainly something is wrong.

ADD REPLYlink written 12 months ago by moushengxu260
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 657 users visited in the last hour