Question: RNA Seq data with GSEA
gravatar for pennakiza
5 months ago by
pennakiza50 wrote:

Hi all,

I am a bit confused with how I should be transforming my data to use it with GSEA. I have tried two different approaches and I am not sure which one is the best.

First, I have created a DGElist, filtered my low count reads and transformed it with voom. These data I fed then to GSEA.

y <- DGEList(counts = fc_mydata$counts, genes=fc_mydata$annotation[,c("GeneID","Length")])
y<-calcNormFactors(y, method="TMM")
 write.table(file="GSEA_table_final.txt",cbind("NAME"=symb,GSEA_table),row.names = F,quote = F,sep = "\t")

Second approach, I ran DESeq2 on raw counts, transformed it with vsn and fed that to GSEA.

dds<-DESeqDataSetFromMatrix(countData=fc_mydata$counts, colData=mydata, ~Treatment)
res_ordered<- res_ordered[order(res_ordered$padj, decreasing = F),]
rownames(res_ordered)<-make.names(annotation$Symbol[match(rownames(res_ordered), annotation$GeneID)], unique=TRUE)
write.table(res_ordered, "res_ordered",sep="\t")

vst<-vst(dds, blind=F)
rownames(normalised_vst)<-make.names(annotation$Symbol[match(rownames(normalised_vst), annotation$GeneID)], unique=TRUE)
write.table(normalised_vst, "normalised_vst.txt", cbind("NAME" = rownames(normalised_vst), normalised_vst),row.names = F,quote = F,sep = "\t")

What do you think, which is the most correct one? What can I do to improve?

Thanks a lot!

gsea rna-seq vsn voom • 323 views
ADD COMMENTlink modified 5 months ago by Gordon Smyth1.7k • written 5 months ago by pennakiza50
gravatar for Gordon Smyth
5 months ago by
Gordon Smyth1.7k
Gordon Smyth1.7k wrote:

I am the voom author, but I do not recommend voom for this purpose because GSEA is unable to use the voom precision weights. Just use:

y <- DGEList(counts ...)
y <- calcNormFactors(y)
GSEA_table <- cpm(y, log=TRUE)

You do not need filterByExpr here but, if you do use it, it should be used in conjunction with your treatment factor or design matrix.

As an alternative to GSEA, you might also consider the pathway analysis functions such as camera that come as part of limma.

ADD COMMENTlink modified 5 months ago • written 5 months ago by Gordon Smyth1.7k

Thank you Gordon! I will try camera too!

ADD REPLYlink written 5 months ago by pennakiza50
gravatar for dsull
5 months ago by
dsull1.4k wrote:

Look at the instructions here:

Using DESeq2 is fine (and other types of acceptable RNA-seq normalizations should also be fine).

You also may not need to vst-transform your normalized counts, per the following FAQ:

DESeq2 also gives you the log fold changes (treatment compared to control) of each gene. You can just use the log fold changes to rank your gene list and then feed that ranked gene list into GSEAPreranked algorithm. This is what I generally do. If you do this, I'd recommend using lfcShrink to get more accurate log fold change values.

ADD COMMENTlink modified 5 months ago • written 5 months ago by dsull1.4k

Thank you very much!

I thought that you needed to transform the data before but I will certainly try it without it. I have tried to used preranked but in some cases I have only a few DE genes (p-adj <0.05) and if I only use those I don't get any gene sets enriched.

ADD REPLYlink written 5 months ago by pennakiza50

Ah, for the prerank algorithm, you still use ALL your genes, not just your DE ones. So if 200 out of 40000 genes are DE, you still use all 40000 genes. Basically, you just supply a list of ALL genes ranked by their log fold change.

That's the power of GSEA -- you use every single gene and it's not based off of cutoffs/thresholds. Contrast this to gene ontology overrepresentation analysis where you only input a list of genes that are considered DE.

ADD REPLYlink modified 5 months ago • written 5 months ago by dsull1.4k

That's cool, in this case I am getting the same results, which is reassuring!

Thank you very much for your help!

ADD REPLYlink written 5 months ago by pennakiza50
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 641 users visited in the last hour