Question: How to do Differential Expression for developmental series for normalised values?
BehMah wrote:

Hi, I have NORMALISED expression values (RPM) for a developmental series and want to do differential expression analysis. Does anyone know how to do it? Thanks

rna-seq R • 437 views
modified 14 months ago • written 14 months ago by BehMah

Can you elaborate a little more, such as informing us of the programs that you have used so far in order to produce your normalised expression counts? Also, sample size, number of developmental stages?

If you have >=3 developmental stages, then you can use, for example, a simple ANOVA (or the non-parametric Kruskal-Wallis test) in order to derive p-values. For pairwise comparisons, you could use a Wilcoxon signed-rank test, because the samples are from the same cell, just at different stages of differentiation.
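As a minimal sketch of the three tests mentioned above, here they are in base R for a single gene. The RPM values, the stage labels `S1` to `S3`, and the pairing of replicates are all made up for illustration and are not from the original poster's data.

```r
# Hypothetical RPM values for one gene: 4 replicates at each of 3 stages
expr  <- c(5.1, 6.3, 5.8, 6.0,  8.2, 9.1, 8.7, 9.5,  12.4, 11.8, 13.1, 12.9)
stage <- factor(rep(c("S1", "S2", "S3"), each = 4))

# Parametric one-way ANOVA across all stages
fit     <- aov(expr ~ stage)
p_anova <- summary(fit)[[1]][["Pr(>F)"]][1]

# Non-parametric alternative: Kruskal-Wallis rank sum test
p_kw <- kruskal.test(expr ~ stage)$p.value

# Wilcoxon signed-rank test between two stages.
# Assumption: replicate i in S1 is paired with replicate i in S2.
p_wsr <- wilcox.test(expr[stage == "S1"], expr[stage == "S2"],
                     paired = TRUE)$p.value
```

Note that with only 4 pairs the smallest two-sided p-value the signed-rank test can return is 0.125, so very small sample sizes limit what the pairwise non-parametric test can show.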

Knowing the specifics of your analysis so far will help to decide the best strategy going forward, though.

written 14 months ago by Kevin Blighe

Thanks Kevin for your time and answer. Following TopHat, I normalised the reads by library depth to obtain RPM for each sample. Does ANOVA give a p-value per gene? Also, if pairwise comparison using Wilcoxon works, then why do ANOVA at all?
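For reference, the RPM normalisation described here can be sketched in R as follows. The count matrix and feature names are hypothetical, and the library size is assumed to be the column sum of the counted features; the total number of mapped reads per sample may be the more appropriate denominator in practice.

```r
# Hypothetical raw read counts: rows are features, columns are samples
counts <- matrix(c(10, 20, 70,
                   30, 30, 140),
                 nrow = 3,
                 dimnames = list(c("ncRNA1", "ncRNA2", "ncRNA3"),
                                 c("Sam1", "Sam2")))

# Library size per sample (assumption: sum of counted reads in each column)
lib_size <- colSums(counts)

# RPM = count / library size * 1e6, applied column by column
rpm <- sweep(counts, 2, lib_size, "/") * 1e6
```

After this step every column of `rpm` sums to one million, which makes samples with different sequencing depths comparable.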

modified 14 months ago • written 14 months ago by BehMah

So, you are using Cufflinks? Does Cufflinks not have its own built-in statistical tests? TopHat / Cufflinks is also very out of date. The upgraded versions are HISAT2 and StringTie.

The idea of running ANOVA and also a paired test for each pair of conditions is that they show different things: ANOVA shows differences across all conditions; a pairwise test shows differences between just 2 conditions at a time. So, these are not quite the same thing.
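To illustrate the difference, here is a small sketch with made-up values in which only the third stage differs. It uses `pairwise.t.test()` from base R with Benjamini-Hochberg adjustment; `pairwise.wilcox.test()` takes the same main arguments if a rank-based test is preferred. The data and stage names are hypothetical.

```r
# Hypothetical RPM values for one gene: S1 and S2 are similar, S3 is higher
expr  <- c(5.1, 6.3, 5.8, 6.0,  5.4, 6.1, 5.9, 6.4,  12.4, 11.8, 13.1, 12.9)
stage <- factor(rep(c("S1", "S2", "S3"), each = 4))

# Omnibus test: a single p-value saying "some stage differs"
p_omnibus <- summary(aov(expr ~ stage))[[1]][["Pr(>F)"]][1]

# Pairwise tests: one adjusted p-value per pair of stages
pw <- pairwise.t.test(expr, stage, p.adjust.method = "BH")
pw$p.value
```

The omnibus p-value is small, but the pairwise matrix shows that S1 vs S2 is not significant while both comparisons against S3 are, which is information the ANOVA alone does not give.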

Assuming that you are using TopHat / Cufflinks, though, I recommend, first, that you upgrade to HISAT2 / StringTie, and then I also recommend that you use the statistical tests built into these programs in order to derive P values.

written 14 months ago by Kevin Blighe

My pipeline is a bit different from the usual one because I'm doing DE for some non-coding RNAs that don't fit normal DE pipelines, so I need to do the statistics in my own way; as a result, I can't use Cufflinks/Cuffdiff or DESeq/edgeR.

Does ANOVA give a p-value per gene (row), or does it just give an overall p-value for significance between groups?

If I do pairwise comparisons by t-test or Wilcoxon, would I need to run ANOVA before that?

written 14 months ago by BehMah

The ANOVA should return a separate P value for each gene, which tests whether the gene's mean expression differs across the conditions (by comparing between-group variance to within-group variance). The Wilcoxon test should also return a single P value for each gene. It depends on how exactly you implement these, though.

Are you using R, SPSS, Prism, STATA, or something else?

By the way, you should check the distribution of your data before running these tests, for example via a histogram. That said, if you want to err on the side of caution, then make sure to use the Kruskal-Wallis test (the non-parametric counterpart of one-way ANOVA). The Wilcoxon test is non-parametric too.
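A quick way to do this check in R might look like the sketch below. The values are simulated purely for illustration, and the Shapiro-Wilk test is an optional extra that is not part of the advice above.

```r
set.seed(1)

# Hypothetical RPM values for one gene across 30 samples
# (simulated as log-normal, i.e. right-skewed, for illustration only)
rpm_gene <- rlnorm(30, meanlog = 2, sdlog = 1)

# Visual check of the distribution
hist(rpm_gene, breaks = 10, main = "RPM for one gene", xlab = "RPM")

# Optional formal check: a small p-value suggests departure from normality
sw_p <- shapiro.test(rpm_gene)$p.value
```

If the histogram is clearly skewed, or the number of samples is too small to judge, the non-parametric tests are the safer choice.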

written 14 months ago by Kevin Blighe

Thanks again Kevin.

I am using R for all the stats. I think that, to return a p-value for each gene, I should change aov(gene.ex ~ group, data = my_data) a bit.

written 14 months ago by BehMah

I think that you should have a vector or column in a data-frame that indicates the sample grouping, and then you will have to perform the test for each gene, looping over the data-frame. For example:

df
            group    Gene1 Gene2 Gene3 Gene4
    Sam1    Control  6     6     8     5
    Sam2    Control  5     6     3     4
    Sam3    Disease  9     8     2     4
    Sam4    Disease  8     6     7     5



test <- aov(Gene1 ~ group, data=df)

summary(test)
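
The snippet above tests Gene1 only. A minimal sketch of the per-gene loop, rebuilding the same toy data-frame so that it runs on its own, could look like this. The variable names `genes`, `p_values` and `p_adjusted` are just illustrative, and the Benjamini-Hochberg adjustment is an added suggestion for handling many genes.

```r
# Rebuild the toy data-frame shown above
df <- data.frame(
  group = factor(c("Control", "Control", "Disease", "Disease")),
  Gene1 = c(6, 5, 9, 8),
  Gene2 = c(6, 6, 8, 6),
  Gene3 = c(8, 3, 2, 7),
  Gene4 = c(5, 4, 4, 5),
  row.names = c("Sam1", "Sam2", "Sam3", "Sam4")
)

# Every column except the grouping column is treated as a gene
genes <- setdiff(colnames(df), "group")

# Run a one-way ANOVA per gene and pull out the p-value for 'group'
p_values <- sapply(genes, function(g) {
  fit <- aov(df[[g]] ~ df$group)
  summary(fit)[[1]][["Pr(>F)"]][1]
})

# Adjust for multiple testing across genes (Benjamini-Hochberg)
p_adjusted <- p.adjust(p_values, method = "BH")
```

`p_values` is a named vector with one entry per gene, so it can be sorted or joined back to an annotation table by name.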
modified 14 months ago • written 14 months ago by Kevin Blighe

Also note that the non-parametric counterpart of ANOVA, the Kruskal-Wallis test, may be more appropriate given your data (if the distribution is non-normal or sample numbers are low).
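
For a matrix with genes in rows, the Kruskal-Wallis version can be sketched as below. The matrix `rpm`, the feature names and the three stage labels are hypothetical, and the multiple-testing adjustment is again an added suggestion.

```r
# Hypothetical design: 3 stages with 4 samples each
stage <- factor(rep(c("S1", "S2", "S3"), each = 4))

# Hypothetical RPM matrix: rows are genes, columns are the 12 samples
rpm <- rbind(
  ncRNA1 = c(5.1, 6.3, 5.8, 6.0,  8.2, 9.1, 8.7, 9.5,  12.4, 11.8, 13.1, 12.9),
  ncRNA2 = c(3.2, 4.1, 2.9, 3.8,  3.5, 3.0, 4.0, 3.3,  3.9, 3.1, 3.6, 3.4)
)

# One Kruskal-Wallis test per row (gene)
p_kw <- apply(rpm, 1, function(x) kruskal.test(x, stage)$p.value)

# Adjust for multiple testing across genes (Benjamini-Hochberg)
p_kw_adj <- p.adjust(p_kw, method = "BH")
```

Here `ncRNA1` changes steadily across stages and gets a small p-value, whereas `ncRNA2` is flat and does not.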

written 14 months ago by Kevin Blighe

Really appreciate your time and advice.

written 14 months ago by BehMah