Question

Repeated measures ANOVA, GLM, GEE, linear mixed model, or generalized linear mixed model?

4.2 years ago

star ▴ 350

I have a table like the one below (it is a small subset of my data). In this table, one variable was measured at 4 different time points (T1, ..., T4). I would like to check whether there is any significant difference between time points for each sample. Based on that, I will select the samples that vary across time points.

My assumptions about the data are:

non-normal distribution.
unequal variance.
the same sample size for each dependent group.

I have reviewed several methods (such as repeated measures ANOVA, GLM, GEE, linear mixed models, the Kruskal-Wallis test, and GLMM), but I am unsure which one is most appropriate for my data.

sample                     T1            T2            T3            T4
1:824850-825300    0.00000000     0.0000000     0.0000000     0.0000000
1:894445-894831    5.39848590     3.9919398     5.8171244     3.4732853
1:902180-902369    5.30856403     4.7035677     1.6972109     4.0094193
1:911400-911969    3.93351892     8.6449756     3.9462391     5.9417675
1:912000-912125    3.08713416     3.7929570     0.5132366     2.7979578
1:919425-920025    4.37344006     6.4203699     3.5285015     3.4974473
1:934044-934294    9.87882930    11.3788710     7.4419304     6.0622420
1:948960-949100    1.65382187    11.0063484     5.4989633    12.4908832

GLM lme statistic ANOVA • 1.5k views

updated 4.2 years ago by ATpoint 82k • written 4.2 years ago by star ▴ 350

Why not use one of the established statistical frameworks such as edgeR or DESeq2 with a likelihood ratio test (LRT)? Is this count data? That would be a requirement, raw counts to be precise. If not, limma could be an option. Please add details about what these data are.
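To illustrate why raw counts matter, here is a minimal Python sketch. The counts and region lengths are made up, and `tpm` is a hypothetical helper, not part of any package. It shows that TPM-style scaling gives the same values for a shallow library and one that is 100x deeper. The information about how many reads support each value, which count-based models rely on, is therefore gone after normalization.

```python
import math


def tpm(counts, lengths_bp):
    """TPM-style scaling: divide by region length, then rescale to sum to 1e6."""
    rates = [count / length for count, length in zip(counts, lengths_bp)]
    total = sum(rates)
    return [rate / total * 1e6 for rate in rates]


# Hypothetical numbers for three regions (NOT taken from the question's data).
lengths_bp = [450, 386, 189]
shallow_counts = [5, 10, 20]        # few reads per region
deep_counts = [500, 1000, 2000]     # same proportions, 100x more reads

tpm_shallow = tpm(shallow_counts, lengths_bp)
tpm_deep = tpm(deep_counts, lengths_bp)

# The two libraries become indistinguishable after scaling,
# although the deep one supports far more precise estimates.
same_after_scaling = all(
    math.isclose(a, b, rel_tol=1e-9) for a, b in zip(tpm_shallow, tpm_deep)
)
print(same_after_scaling)
```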

4.2 years ago by ATpoint 82k

Thanks for the reply! Yes, it is normalized data (TPM-based) from different histone marks at different time points, but here I show only one histone mark.

Is there any way to compare each row across the different time points?

4.2 years ago by star ▴ 350

score 2 · Accepted Answer · 2020-02-07

4.2 years ago

ATpoint 82k

This comes down to differential ChIP-seq analysis. I suggest you check the csaw package or the DiffBind package, both of which cover differential ChIP-seq analysis. csaw builds on edgeR, and DiffBind can use either edgeR or DESeq2. Do not do homebrew statistics; there is expert software for this. As said, an LRT could be used to find any differences, or you can set up specific contrasts such as T2-T1, T3-T2, etc. The manuals are quite comprehensive, with lots of example code. Be sure to start with raw counts, not RPKM or TPM. Neither of those is suitable for a meaningful analysis, and there is a lot of material online that explains why. Raw counts are the input for these tools.
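The contrasts mentioned above can be pictured as simple vectors. This is a minimal Python sketch, assuming a design with one coefficient per time point and no intercept. The `contrast` helper is hypothetical and only illustrates the idea. In practice you would specify contrasts in R as described in the package manuals.

```python
TIME_POINTS = ["T1", "T2", "T3", "T4"]


def contrast(later, earlier, levels=TIME_POINTS):
    """Contrast vector for 'later - earlier', one entry per time-point coefficient."""
    vector = [0] * len(levels)
    vector[levels.index(later)] = 1
    vector[levels.index(earlier)] = -1
    return vector


# Successive comparisons: T2-T1, T3-T2, T4-T3.
successive_contrasts = {
    f"{later}-{earlier}": contrast(later, earlier)
    for earlier, later in zip(TIME_POINTS, TIME_POINTS[1:])
}

for name, vector in successive_contrasts.items():
    print(name, vector)
```

Each vector picks out the difference between two time-point coefficients. An LRT, by contrast, asks whether any of the time-point coefficients differ at all.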

4.2 years ago by ATpoint 82k

Thanks for the reply! Do you mean that I should treat each time point as an independent sample and compare them? Does that work when I don't have any replicates?

4.2 years ago by star ▴ 350

Yes, that is what I meant, but it only works with replicates. Unreplicated data will not work with the tools I suggested. It is bad practice to run unreplicated experiments, since they are not reliable and you cannot assess reproducibility or the dispersion between replicates. There is a lot of material online on how to deal with unreplicated data. What you could do is, for example, use T3 and T4 as a group "late_time_points" and T1 and T2 as "early_time_points", so that each group has a biological replicate and you capture the most obvious early/late time effects. If you really want to see effects per time point, you have to repeat the experiments with replicates; n=3 is recommended as a minimum.
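As a rough sketch of that grouping in Python: the helper names below are hypothetical, and this only builds the sample-to-group assignment. It does not run any statistical test.

```python
from collections import defaultdict

# Grouping suggested above: T1/T2 as early, T3/T4 as late.
TIME_POINT_TO_GROUP = {
    "T1": "early_time_points",
    "T2": "early_time_points",
    "T3": "late_time_points",
    "T4": "late_time_points",
}


def build_sample_sheet(mapping):
    """Return one (sample, group) row per time point."""
    return [(sample, group) for sample, group in mapping.items()]


def samples_per_group(mapping):
    """Collect the samples that act as replicates within each group."""
    grouped = defaultdict(list)
    for sample, group in mapping.items():
        grouped[group].append(sample)
    return dict(grouped)


sample_sheet = build_sample_sheet(TIME_POINT_TO_GROUP)
group_members = samples_per_group(TIME_POINT_TO_GROUP)

for group, members in group_members.items():
    print(group, members)
```

A sample sheet like this, together with the raw count matrix, is the kind of input a two-group comparison needs. Each group then has two samples, which is still below the recommended minimum of three.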

4.2 years ago by ATpoint 82k

This is a great point. The only tool I know of that doesn't require replicates is MAnorm, but csaw or DiffBind are likely much more robust and you should definitely acquire replicates if possible.

4.2 years ago by jared.andrews07 ★ 16k