Question: Subgroup significance testing
4.7 years ago by
pavenhuizen70 wrote:

Dear all,

I would like to ask for your help with the following problem: I have a subgroup of genes that contain a certain motif, and I would like to know whether the presence of this motif significantly changes the expression of those genes. I have RNA-Seq data for a control (WT), and two over-expression and mutant lines.

So far I have come up with one approach, outlined below, but I'm not sure it is correct, and I would like to know whether there are other methods for testing the significance of my subgroup.

My planned approach is as follows:

  1. Obtain the significantly differentially expressed genes with edgeR, by comparing WT with the over-expression and mutant conditions.
  2. Divide the genes into three categories, based on the edgeR output: +1 if a gene is significantly differentially expressed AND up-regulated, -1 if it is significantly differentially expressed AND down-regulated, and 0 if it is not significant.
  3. Perform a chi-square analysis on the categorized data, comparing the frequencies/percentages of the subgroup with the frequencies/percentages of all the genes (including those of the subgroup).
  4. Do a bootstrap analysis (resampling with replacement) and get the one-sided p-value.
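
Step 2 of the list above can be sketched in Python roughly as follows. This is only an illustration: it assumes the edgeR table has been exported with a log fold change and an adjusted p-value (FDR) per gene, the 0.05 cutoff is an assumed threshold, and the gene names and values are made up.

```python
def categorize(log_fc, fdr, alpha=0.05):
    """Return +1 (significant and up), -1 (significant and down) or 0 otherwise.

    alpha is an assumed FDR cutoff; adjust it to whatever threshold you use.
    """
    if fdr >= alpha:
        return 0          # not significant
    if log_fc > 0:
        return 1          # significant and up-regulated
    if log_fc < 0:
        return -1         # significant and down-regulated
    return 0              # significant but no change in direction (edge case)


# Hypothetical edgeR-style results: gene -> (logFC, FDR). Values are invented.
results = {
    "geneA": (2.1, 0.001),
    "geneB": (-1.4, 0.020),
    "geneC": (0.3, 0.600),
    "geneD": (-0.2, 0.900),
}

categories = {gene: categorize(lfc, fdr) for gene, (lfc, fdr) in results.items()}
print(categories)
```

The resulting dictionary can then be tabulated per category for the subgroup and for all genes.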

And that's about it. I don't have much experience in this kind of analysis and my statistics are not that strong, so please correct me if I made any mistakes or if you know of a better way of testing!

Thanks in advance for everyone taking the time to read this and to anyone who is willing/capable of helping me with my problem.

statistics rna-seq • 1.6k views
modified 4.7 years ago by ss9674520 • written 4.7 years ago by pavenhuizen70
4.7 years ago by
Benn8.0k wrote:

I would suggest a hypergeometric test, as is done for gene set enrichment analysis (with GO terms).
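
For reference, a minimal sketch of a one-sided hypergeometric enrichment test in plain Python. The function name and the example counts are hypothetical; the idea is that, out of all tested genes, some carry the motif, some are differentially expressed, and the test asks how likely an overlap at least as large as the observed one is by chance.

```python
from math import comb


def hypergeom_upper_tail(overlap, n_genes, n_motif, n_de):
    """P(X >= overlap) for a hypergeometric variable X.

    n_genes : all genes tested (the background)
    n_motif : genes in the background that carry the motif
    n_de    : differentially expressed genes (the "draw")
    overlap : DE genes that also carry the motif
    """
    lowest = max(overlap, 0)
    highest = min(n_motif, n_de)
    total = comb(n_genes, n_de)
    favourable = sum(
        comb(n_motif, i) * comb(n_genes - n_motif, n_de - i)
        for i in range(lowest, highest + 1)
    )
    return favourable / total


# Invented numbers: 20 genes, 5 with the motif, 6 DE genes, 4 of them with the motif.
p_value = hypergeom_upper_tail(overlap=4, n_genes=20, n_motif=5, n_de=6)
print(p_value)
```

A small p-value here would suggest that motif genes are over-represented among the differentially expressed genes. The background should be the set of genes that were actually tested, not the whole genome.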

written 4.7 years ago by Benn8.0k

Thank you for your quick response! Is it possible to do a hypergeometric test for two conditions, or should I do two hypergeometric tests for each sample, one for only the up-regulated genes and one for the down-regulated genes? Or, alternatively, should I just compare the number of significantly differentially expressed genes against the others?

--- EDIT ---

I will try hypergeometric testing, but I was wondering whether there is anything wrong with my proposed approach.

modified 4.7 years ago • written 4.7 years ago by pavenhuizen70

You can do it either way, but the usual approach is to compare the number of differentially expressed genes.

written 4.7 years ago by Naresh D J80

I don't see the point of using the chi-square test.

As you describe your experimental design, you want to know whether your up- or down-regulated genes (or up and down together) are significantly enriched for the motif. You can test these three cases separately with a hypergeometric test.

So step 1 of your approach sounds good; after that, try the hypergeometric test. Good luck!

written 4.7 years ago by Benn8.0k

Thank you! I'm working on it now

written 4.7 years ago by pavenhuizen70