Question

Statistical test to compare expression change in a subgroup of genes versus all genes?

0

7.6 years ago

biostart ▴ 370

Dear statisticians:

Suppose I have two cell conditions and processed RNA-seq data for both, giving log2 fold changes in expression.

I want to test the hypothesis that a subgroup of genes is upregulated more strongly than genes overall. To do this, I create one column with the log2 fold changes of all genes whose changes are statistically significant (say, 10k genes), and a second column with the log2 fold changes of the statistically significant genes that belong to my subgroup of interest (say, 1k genes). I then run a two-sample t-test on these two groups of genes and get a P value that is quite low.

Could you please comment on whether I am doing this correctly?

Thanks

PS. Someone suggested that I should use the Mann-Whitney test instead of the two-sample t-test. Could statisticians please comment on this?
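For reference, the two tests mentioned in the question can be sketched in plain Python on simulated data. Everything here is an illustrative assumption rather than the poster's actual workflow: the simulated log2 fold changes, the function names `welch_t` and `mann_whitney_u`, and the large-sample normal approximation used for both p-values. The sketch compares the subgroup against the remaining genes rather than against all genes, because a two-sample test assumes the two groups do not share observations.

```python
import math
import random
from statistics import NormalDist, mean, variance


def welch_t(x, y):
    """Welch's t statistic for x vs y, with a two-sided p-value from the
    normal approximation (reasonable only for large samples)."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    t = (mean(x) - mean(y)) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for x, with a two-sided p-value from the
    normal approximation. Ties get average ranks; the variance is not
    tie-corrected, which is fine for continuous data."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = len(pooled)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p


# Simulated log2 fold changes (assumption): the subgroup is shifted upward.
random.seed(0)
rest = [random.gauss(0.0, 1.0) for _ in range(2000)]      # genes outside the subgroup
subgroup = [random.gauss(0.5, 1.0) for _ in range(200)]   # genes in the subgroup

t_stat, t_p = welch_t(subgroup, rest)
u_stat, u_p = mann_whitney_u(subgroup, rest)
print(f"Welch t = {t_stat:.2f}, p = {t_p:.3g}")
print(f"Mann-Whitney U = {u_stat:.0f}, p = {u_p:.3g}")
```

The t-test compares means and is sensitive to outlying fold changes; the Mann-Whitney test works on ranks and asks whether values in one group tend to be larger than in the other. With thousands of genes, either can give a very small p-value even for a modest shift, so the size of the difference is worth reporting alongside it.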

RNA-Seq • 3.1k views

ADD COMMENT • link updated 7.6 years ago by h.mon 35k • written 7.6 years ago by biostart ▴ 370

0

Did you use the same background to find significant genes in both sets? Are the two sets independent?

ADD REPLY • link 7.6 years ago by H.Hasani ▴ 990

0

Yes, significance was determined once for all genes using the standard workflow; I am not changing it when splitting the genes into subsets.

ADD REPLY • link 7.6 years ago by biostart ▴ 370

0

I'm not sure whether the t-test is appropriate in this case (see Independent t-test using SPSS Statistics). If you are interested in the strength of a statistical signal (the fold changes), you could use a volcano plot.

hth

ADD REPLY • link 7.6 years ago by H.Hasani ▴ 990

0

I was not using an independent t-test; I was using a two-sample t-test.

ADD REPLY • link 7.6 years ago by biostart ▴ 370

score 0 · Answer 1 · 2016-10-03

0

7.6 years ago

h.mon 35k

It sounds like you want to perform a Gene Set Enrichment Analysis. The R/Bioconductor package gage has all the framework you need; see its page here and this manual on how to prepare custom gene sets.
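To make the idea behind a gene set test less opaque, here is a minimal permutation sketch in plain Python. It is not gage's algorithm and does not use its API; the function name `gene_set_permutation_p`, the toy gene names, and the simulated fold changes are all assumptions made for illustration. It asks how often a randomly drawn gene set of the same size has a mean log2 fold change at least as large as the set of interest.

```python
import random
from statistics import mean


def gene_set_permutation_p(log2fc, gene_set, n_perm=1000, seed=0):
    """One-sided permutation p-value for 'gene_set is shifted upward'.

    log2fc   : dict mapping gene name -> log2 fold change (all tested genes)
    gene_set : list of gene names of interest (must be keys of log2fc)
    Returns (observed mean log2FC of the set, p-value).
    """
    rng = random.Random(seed)
    genes = list(log2fc)
    k = len(gene_set)
    observed = mean(log2fc[g] for g in gene_set)
    hits = 0
    for _ in range(n_perm):
        # Draw a random gene set of the same size from all tested genes.
        random_set = rng.sample(genes, k)
        if mean(log2fc[g] for g in random_set) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (assumption): 1000 genes, the first 50 shifted upward by 1.
data_rng = random.Random(1)
log2fc = {f"gene{i}": data_rng.gauss(0.0, 1.0) for i in range(1000)}
gene_set = [f"gene{i}" for i in range(50)]
for g in gene_set:
    log2fc[g] += 1.0

observed_mean, p_value = gene_set_permutation_p(log2fc, gene_set)
print(f"mean log2FC of set = {observed_mean:.2f}, permutation p = {p_value:.4f}")
```

The smallest p-value this can return is 1 / (n_perm + 1), so more permutations are needed to resolve very small p-values. Dedicated packages add steps such as multiple-testing correction across many gene sets, which this sketch leaves out.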

ADD COMMENT • link 7.6 years ago by h.mon 35k

0

Thank you, but I would prefer not to use any black-box solutions; I want to understand what I am doing.

ADD REPLY • link 7.6 years ago by biostart ▴ 370

2

The documentation and publications for those tools are quite clear about what is going on; it is not a black-box solution.

There is a difference between understanding what you are doing and reinventing the wheel.

ADD REPLY • link 7.6 years ago by WouterDeCoster 47k