Question

Is there a way to test or say if one gene signature is better than another for predicting a phenotype?

16 months ago

msn ▴ 130

Sorry if this sounds n00b-ish. But let's say you have a drug, and you treat one group of cells with it and leave the other untreated. You do RNA-seq and find some DEGs that change when the drug is added. Is there a way to score how good that signature is?

I assume it's something related to how many samples you have. If you have 6 cell lines in the drug-treated group and 8 in the non-treated group, that n would impact it. You could create some sort of z-score for the gene set, then score each sample, and then see what percentage is statistically higher than the mean of the non-treated group? Although the n is already taken into consideration a little by the DGE analysis.
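A minimal sketch of the z-score idea described above, in plain Python. Everything here is an illustrative assumption rather than an established method from the thread: the function name `signature_scores`, the toy expression values, and the choice to z-score each gene against the untreated samples and average the signed z-scores per sample. Note that scoring the same samples that were used to pick the DEGs is circular, so the scores are only meaningful on samples that were held out of the DE analysis.

```python
from statistics import mean, stdev

def signature_scores(expr, control_ids, signature):
    """Score every sample with one gene signature.

    expr: dict gene -> dict sample -> expression (e.g. log2 CPM); hypothetical layout.
    control_ids: sample names of the untreated group.
    signature: dict gene -> +1 (up with drug) or -1 (down with drug).
    Returns dict sample -> mean signed z-score, where each gene is
    z-scored against the mean and SD of the untreated samples.
    """
    samples = list(next(iter(expr.values())).keys())
    scores = {}
    for s in samples:
        zs = []
        for gene, direction in signature.items():
            ctrl = [expr[gene][c] for c in control_ids]
            mu, sd = mean(ctrl), stdev(ctrl)
            if sd == 0:
                continue  # gene is constant in controls, z-score undefined
            zs.append(direction * (expr[gene][s] - mu) / sd)
        scores[s] = mean(zs) if zs else float("nan")
    return scores

# Toy data (made up): 3 untreated samples (c1..c3), 2 treated (t1, t2).
expr = {
    "G1": {"c1": 1.0, "c2": 2.0, "c3": 3.0, "t1": 6.0, "t2": 7.0},
    "G2": {"c1": 5.0, "c2": 4.0, "c3": 6.0, "t1": 1.0, "t2": 2.0},
}
signature = {"G1": +1, "G2": -1}  # G1 goes up with drug, G2 goes down
scores = signature_scores(expr, ["c1", "c2", "c3"], signature)
print(scores)
```

Flipping the sign for down-regulated genes keeps a strong response positive regardless of direction. The per-sample scores can then be compared between groups or fed into a classifier-style metric.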

What would be the best way to say that signature A was more informative of an outcome (or more predictive) than, say, signature B, if you had two signatures?

Apologies again for the n00b question. I have a feeling I am thinking about this in the wrong way, but it was a late-night discussion and I haven't got a good solid answer in my mind. If this were a continuous variable I think I would have some more ideas, but it's a binary situation... drug or no drug, disease or no disease, survivor or non-survivor type of thing...

Thanks in advance!

stats RNA-seq gene-signature • 806 views

ADD COMMENT • link updated 13 months ago by Ram 43k • written 16 months ago by msn ▴ 130

score 0 · Answer 1 · 2023-01-06

16 months ago

liorglic ★ 1.4k

I think what you are looking for is the log2 fold change (L2FC) and adjusted p-value. Together these measures tell you the "effect size" and statistical significance of a DEG. These are often plotted together as what's called a volcano plot.
If you are looking for a summary statistic across all genes, then I don't know if a standard one exists, but you may simply compare the distributions of L2FC or -log P.
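As a rough illustration of comparing L2FC distributions between two gene sets, here is a sketch of a permutation test on the difference in median |L2FC|. The function name and the toy fold changes are made up for the example. Genes within a signature are usually correlated, so the p-value is only a heuristic, and this says how strongly the genes change, not how well the set predicts a phenotype.

```python
import random
from statistics import median

def perm_test_median_abs_lfc(lfc_a, lfc_b, n_perm=5000, seed=0):
    """Two-sided permutation test on median |L2FC| of set A minus set B.

    lfc_a, lfc_b: lists of log2 fold changes for the genes in each signature.
    Returns (observed difference, permutation p-value).
    """
    rng = random.Random(seed)
    a = [abs(x) for x in lfc_a]
    b = [abs(x) for x in lfc_b]
    observed = median(a) - median(b)
    pooled = a + b
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign genes to the two sets
        d = median(pooled[:len(a)]) - median(pooled[len(a):])
        if abs(d) >= abs(observed):
            hits += 1
    # +1 in numerator and denominator avoids reporting p = 0
    return observed, (hits + 1) / (n_perm + 1)

# Toy fold changes (made up): signature A has larger effects than B.
lfc_a = [2.1, -1.8, 2.5, -2.2, 1.9]
lfc_b = [0.6, -0.4, 0.8, -0.5, 0.7]
observed, p_value = perm_test_median_abs_lfc(lfc_a, lfc_b)
print(observed, p_value)
```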

ADD COMMENT • link 16 months ago by liorglic ★ 1.4k

Yeah, I was talking more about a collection of DE genes (a gene set, if you will) and how likely that gene set is to be associated with an outcome. Say I have two gene sets for two different outcomes and I want to test which gene set is better at predicting, or even just associated with, a specific outcome, where every gene in the gene set is already DE. Obviously some genes are driven more by specific samples or patients, and some phenotypes have more power and more samples. Surely that will impact how informative a gene signature is, and I just need a way to sort of weigh two or more signatures statistically in this thought experiment.

ADD REPLY • link 16 months ago by msn ▴ 130
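One possible way to weigh two signatures against a binary outcome, sketched here under stated assumptions: give each sample a score per signature (for example a mean z-score over the signature genes), compute the AUC of each score for separating the two groups, and bootstrap the samples to get an interval on the AUC difference. The function names, the toy scores, and the choice of a stratified percentile bootstrap are all illustrative, not something proposed in the thread. The scores should come from samples that were not used to select the DE genes, otherwise both AUCs are inflated. With group sizes like 6 vs 8, expect a wide interval.

```python
import random

def auc(pos, neg):
    """Fraction of (positive, negative) pairs where the positive scores higher; ties count 0.5."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_diff(scores_a, scores_b, labels, n_boot=2000, seed=0):
    """Compare two signatures scored on the same samples.

    scores_a, scores_b: per-sample scores from signatures A and B (same order).
    labels: 1 = treated/case, 0 = untreated/control.
    Returns (AUC_A - AUC_B, (lower, upper)) with a 95% percentile interval.
    """
    rng = random.Random(seed)
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]

    def diff(pi, ni):
        a = auc([scores_a[i] for i in pi], [scores_a[i] for i in ni])
        b = auc([scores_b[i] for i in pi], [scores_b[i] for i in ni])
        return a - b

    observed = diff(pos_idx, neg_idx)
    boots = []
    for _ in range(n_boot):
        # resample within each group so both classes are always present
        pi = [rng.choice(pos_idx) for _ in pos_idx]
        ni = [rng.choice(neg_idx) for _ in neg_idx]
        boots.append(diff(pi, ni))
    boots.sort()
    lower = boots[int(0.025 * n_boot)]
    upper = boots[int(0.975 * n_boot) - 1]
    return observed, (lower, upper)

# Toy scores (made up): 4 treated then 4 untreated samples.
labels   = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [3.1, 2.8, 3.5, 2.9, 0.2, -0.1, 0.4, 0.0]
scores_b = [1.0, 0.1, 0.9, -0.2, 0.3, 0.0, 0.5, -0.4]
observed, (lower, upper) = bootstrap_auc_diff(scores_a, scores_b, labels)
print(observed, lower, upper)
```

Resampling the same samples for both signatures keeps the comparison paired, which matters because the two scores are measured on the same cell lines. If the interval excludes 0, that is some evidence one signature separates the groups better on this data.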