How to rank a gene list using correlation coefficient and p-values for GSEA/clusterProfiler?
7 days ago
c-wes ▴ 10

I have two datasets: a bulk RNA-seq dataset covering a set of samples, and a viability dataset with scores giving the effect of each treatment on each sample.

From these datasets, I calculated the correlation between each gene's expression profile and the viability scores for each treatment. So for each treatment, I have a list of genes, each with its correlation to the treatment viability and a p-value.

I want to rank this list as input for GSEA (I previously used gseKEGG in clusterProfiler). What is the best way to rank this gene list?

Possibilities I could imagine:

  • correlation coefficient (from +1 to -1)
  • sign(correlation coefficient) * -log10(p-value)
  • p-value (from 0 to 1)
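
To make the three options concrete, here is a minimal sketch of how each ranking could be built from the per-gene results. The gene names, correlations, and p-values are made up for illustration, and the `P_FLOOR` clamp is my own assumption to avoid taking the log of a p-value reported as 0. It uses plain Python rather than clusterProfiler.

```python
import math

# Hypothetical per-gene results for one treatment: (gene, correlation, p-value).
results = [
    ("GENE_A", 0.82, 1e-6),
    ("GENE_B", -0.75, 1e-4),
    ("GENE_C", 0.10, 0.60),
    ("GENE_D", -0.30, 0.04),
]

P_FLOOR = 1e-300  # assumed floor so that p == 0 does not produce log10(0)


def signed_log_p(r, p):
    """Return sign(r) * -log10(p), with p clamped to P_FLOOR."""
    sign = (r > 0) - (r < 0)
    return sign * -math.log10(max(p, P_FLOOR))


def rank_genes(results, score):
    """Return (gene, score) pairs sorted from highest to lowest score."""
    scored = [(gene, score(r, p)) for gene, r, p in results]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Option 1: correlation coefficient, from most positive to most negative.
by_r = rank_genes(results, lambda r, p: r)

# Option 2: signed -log10(p), which keeps the direction of the correlation.
by_signed_log_p = rank_genes(results, signed_log_p)

# Option 3: raw p-value, smallest first; note the direction is lost.
by_p = sorted(((gene, p) for gene, r, p in results), key=lambda pair: pair[1])

for name, ranking in [("r", by_r), ("signed -log10(p)", by_signed_log_p), ("p", by_p)]:
    print(name, [gene for gene, _ in ranking])
```

In this toy example the p-value ranking places GENE_A and GENE_B next to each other even though they correlate in opposite directions, whereas the two signed scores put them at opposite ends of the list.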

Using only the correlation coefficient would be the easiest to interpret, as a strong positive or negative correlation corresponds directly to the effect of the treatment on viability (the gene is correlated with sensitivity or resistance to the treatment).

I would not really know how to interpret a ranking by p-value alone, since it carries no information about the direction of the correlation.

Using the combined formula with the correlation coefficient and p-value would also be interpretable, but a bit messier (the gene is significantly correlated with sensitivity or resistance to the treatment). I took this formula from differential expression, where I have seen sign(FoldChange) * -log10(p) used for ranking, but I have also seen critiques of that approach.
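
One thing that may help when comparing the first two options: if the p-values come from the usual t-test for a Pearson correlation (an assumption on my part, with t = r * sqrt((n - 2) / (1 - r^2)) on n - 2 degrees of freedom), then for a fixed number of samples n the p-value depends only on |r|. In that case the signed -log10(p) score and the correlation coefficient would order the genes identically, and the two can only disagree when n differs between genes, for example because of missing values. The sketch below uses the t statistic as a stand-in for the signed score; the genes and sample counts are hypothetical.

```python
import math


def t_statistic(r, n):
    """t statistic for a Pearson correlation r over n samples (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical genes: (gene, correlation, number of samples used).
same_n = [("GENE_X", 0.5, 30), ("GENE_Y", 0.4, 30), ("GENE_Z", -0.6, 30)]
mixed_n = [("GENE_X", 0.5, 10), ("GENE_Y", 0.4, 50), ("GENE_Z", -0.6, 30)]


def order_by(genes, key):
    """Return gene names sorted from highest to lowest value of key."""
    return [gene for gene, r, n in sorted(genes, key=key, reverse=True)]


for label, genes in [("same n", same_n), ("mixed n", mixed_n)]:
    by_r = order_by(genes, key=lambda g: g[1])
    by_t = order_by(genes, key=lambda g: t_statistic(g[1], g[2]))
    print(label, "| by r:", by_r, "| by t:", by_t, "| same order:", by_r == by_t)
```

With equal n the two orderings match; with unequal n, GENE_Y (weaker correlation, more samples) moves ahead of GENE_X in the t-based ordering.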

Overall, I can't really find a good source for how to rank genes for GSEA outside of differential expression.

ClusterProfiler GSEA correlation rank genelist • 285 views