Question: Gene set enrichment analysis with logFC and PValue
Vasu420 wrote:

Hi,

I have RNA-Seq data with 100 samples and 30k genes, with samples as columns and genes as rows. The comparison is Tumor vs Normal.

After the filtering step, about 19k genes were used for the differential analysis. With cutoffs of FC > 2 and FDR < 0.05, there are about 1000 differentially expressed genes.

I'm going to rank the genes by fold change and p-value and use that ranking as input for GSEA.

For gene set enrichment analysis, should I use only those 1000 differentially expressed genes, or all 19k genes, as input?

Thank you.

modified 17 months ago by sangram_keshari230 • written 17 months ago by Vasu420
h.mon29k wrote:

You have to use the 19k genes, but how you will do so depends on the enrichment method you are using. For GSEA, you have to use the ranked vector of all 19k genes.

written 17 months ago by h.mon29k

I'm actually using this for GSEA. Do you think ranking the genes by FC and p-value, as below, is right?

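# Read the differential expression table (tab-separated, with a header row)
x <- read.table("DE_genes.txt", sep = "\t", header = TRUE)
head(x)
# Direction of change: +1 (up), -1 (down), 0 (logFC exactly 0)
x$fcsign <- sign(x$logFC)
# Significance on the -log10 scale
x$logP <- -log10(x$p_value)
# Signed significance; dividing by +1/-1 gives the same result as multiplying,
# but a gene with logFC exactly 0 yields Inf or NaN here
x$metric <- x$logP / x$fcsign
# Keep the gene identifier and the ranking metric, then write the .rnk file
y <- x[, c("Gene", "metric")]
head(y)
write.table(y, file = "DE_genes.rnk", quote = FALSE, sep = "\t", row.names = FALSE)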

I will use DE_genes.rnk as the input for GSEA. Could you please comment on this approach? Thank you.

written 17 months ago by Vasu420
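
For readers who want to guard the ranking step against edge cases, here is a minimal sketch, not the poster's exact script. It assumes the same column names (Gene, logFC, p_value), uses a made-up toy table in place of DE_genes.txt, and makes one choice that is purely an assumption: p-values of exactly 0 are replaced by the smallest positive p-value in the table, so that -log10 stays finite. Multiplying by the sign instead of dividing means a gene with logFC exactly 0 gets a metric of 0 rather than Inf.

```r
# Toy DE table standing in for DE_genes.txt (values are made up for illustration)
x <- data.frame(
  Gene    = c("A", "B", "C", "D"),
  logFC   = c(2.1, -1.5, 0, 0.8),
  p_value = c(1e-8, 0, 0.5, 0.03)
)

# Assumption: cap p-values of exactly 0 at the smallest positive p-value
min_p <- min(x$p_value[x$p_value > 0])
p <- pmax(x$p_value, min_p)

# Signed significance; multiplying by the sign keeps logFC == 0 finite (metric 0)
x$metric <- sign(x$logFC) * -log10(p)

# Keep every tested gene, sorted from most up-regulated to most down-regulated
y <- x[order(x$metric, decreasing = TRUE), c("Gene", "metric")]

# Write a two-column, tab-separated rank file (a temporary file in this sketch)
rnk_file <- tempfile(fileext = ".rnk")
write.table(y, file = rnk_file, quote = FALSE, sep = "\t", row.names = FALSE)
```

In a real analysis the toy data frame would be replaced by the full table of all tested genes, not only those passing the FC and FDR cutoffs.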

This seems fine; it is the same metric as the one used in "Gene Set Enrichment Analysis (GSEA) explained".

written 17 months ago by h.mon29k

If I rank the genes only by p-value or adjusted p-value, would that be logical?

written 7 months ago by krushnach80680

Yes, it would be logical, but keep in mind that the fold change of some of those genes may be too small for them to be called differentially expressed.

written 7 months ago by sangram_keshari230
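
One thing a p-value alone does not carry is the direction of change. The sketch below, using made-up values and hypothetical gene names, compares an unsigned -log10(p) ranking with the signed version discussed earlier in the thread. It is only an illustration of how the two orderings differ, not a recommendation for either.

```r
# Toy values (made up): two strongly changed genes in opposite directions, one nearly unchanged
de <- data.frame(
  Gene    = c("up1", "down1", "flat1"),
  logFC   = c(3, -3, 0.1),
  p_value = c(1e-6, 1e-6, 0.8)
)

# Ranking by p-value only: no direction information
unsigned <- -log10(de$p_value)
names(unsigned) <- de$Gene

# Ranking by p-value with the sign of the fold change
signed <- sign(de$logFC) * -log10(de$p_value)
names(signed) <- de$Gene

# Unsigned: up1 and down1 share the top of the list
unsigned_sorted <- sort(unsigned, decreasing = TRUE)
# Signed: up1 is at the top, down1 at the bottom
signed_sorted <- sort(signed, decreasing = TRUE)

print(unsigned_sorted)
print(signed_sorted)
```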

Hi h.mon, the linked page writes the metric as "signed fold change * -log10pvalue", whereas the comment above uses x$logP/x$fcsign.

Are the two equivalent?

written 6 months ago by bhanu.chandra120
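
A quick numerical check may help here. The sketch below uses made-up values and assumes that "signed fold change" means only the sign of the fold change (+1 or -1); the thread itself does not confirm that reading. Under that assumption, dividing -log10(p) by the sign and multiplying by it give identical results whenever logFC is not zero, because the sign is always +1 or -1. Multiplying by the logFC value itself would be a different metric.

```r
# Made-up values for illustration; none of the logFC values is zero
logFC   <- c(2.5, -1.2, 0.4, -3.0)
p_value <- c(1e-5, 0.01, 0.2, 1e-9)

logP <- -log10(p_value)

by_division       <- logP / sign(logFC)  # form used in the R code earlier in the thread
by_multiplication <- sign(logFC) * logP  # sign of fold change times -log10(p)
by_logfc          <- logFC * logP        # different metric: also weights by effect size

same_as_multiplication <- isTRUE(all.equal(by_division, by_multiplication))
same_as_logfc_weighted <- isTRUE(all.equal(by_division, by_logfc))

print(same_as_multiplication)  # TRUE
print(same_as_logfc_weighted)  # FALSE
```

The two forms only diverge for a gene whose logFC is exactly 0: the product is 0, while the division is Inf or NaN.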
sangram_keshari230 wrote:

Hi. Basically, if you want to check which biological state your gene set belongs to, it is ideal to look at some of the top up-regulated and down-regulated differentially expressed genes. How many to take is up to you: I have seen people use the top 100, and other times the whole set of up- and down-regulated genes.

written 17 months ago by sangram_keshari230