I noticed that the t-test degrees of freedom in limma (and F-test denominator degrees of freedom) are always the same regardless of which treatments are in a contrast. However, I have a lot of missing data for some specific features in some specific treatments in a phosphoproteomics dataset.
Is it statistically correct that a t- or F-test involving only treatments with few or no missing values suffers a loss of degrees of freedom because treatments _not in the contrast(s) being tested_ have missing values?
This can lead to an unintuitive ordering of t and F-statistics for features in my data.
For example, I have a toy dataset where some features have many more missing values than others, but only in certain treatments.
When I do a t-test on treatments where the sample size is the same between features 1 and 2 (n = 3 for all the blue treatments), the degrees of freedom are still very different because of the other treatments that have missing values:
library(limma)

# Specify model: means model, no intercept
design <- model.matrix(~0 + treatment, data = metadata)

# Fit model (warns that some treatments have all-NA values for some features)
fit <- lmFit(object = counts, design = design)

# Specify contrasts: chosen so that the sample size for the
# example features is the same
contr <- makeContrasts(
  contrasts = c(
    "treatmenttreatment2 - treatmenttreatment1",
    "treatmenttreatment3 - treatmenttreatment1"
  ),
  levels = colnames(coef(fit))
)

# Estimate contrasts with empirical Bayes moderation
fit_contrasts <- contrasts.fit(fit, contr)
fit_contrasts_ebayes <- eBayes(fit_contrasts, trend = TRUE)

# Investigate results for features 1 and 2
print(fit_contrasts_ebayes$df.residual[1:2])
print(topTable(fit_contrasts_ebayes, coef = 1, number = 2, sort = "none"))
This leads to the opposite order of significance from what I would expect (feature 2 is more significant than feature 1, even though feature 1 has the larger absolute logFC):
> print(fit_contrasts_ebayes$df.residual[1:2])
[1] 36 58
> print(topTable(fit_contrasts_ebayes, coef = 1, number = 2, sort = "none"))
logFC AveExpr t P.Value adj.P.Val B
feature1 5.440525 15.84160 7.507528 4.740642e-09 2.370321e-07 10.49628
feature2 -2.776424 22.47795 -8.531085 5.767286e-12 5.767286e-10 16.92264
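To make the mechanism concrete, here is a small base-R sketch (toy data in the same shape as above, not my real dataset) showing that the residual degrees of freedom are the number of non-missing observations minus the number of estimable group means, so missingness in treatments outside the contrast still reduces the df:

```r
set.seed(1)
# 5 treatments, n = 3 each, as in the toy dataset above
treatment <- factor(rep(paste0("treatment", 1:5), each = 3))

y1 <- rnorm(15)                                         # feature with no missing values
y2 <- rnorm(15)
y2[treatment %in% c("treatment4", "treatment5")] <- NA  # two treatments entirely missing

# Residual df = non-missing observations - estimable group means
df1 <- lm(y1 ~ 0 + treatment)$df.residual  # 15 - 5 = 10
df2 <- lm(y2 ~ 0 + treatment)$df.residual  #  9 - 3 = 6
```

Both features have complete data (n = 3 per group) in treatments 1-3, yet their residual df differ (10 vs 6), which is the same pattern as the 58 vs 36 above.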
Thank you for the quick and thorough answer! Indeed, I checked and the variance differs between my two features. However, I can imagine a scenario where the variance is the same but the difference in missingness still alters the t-statistic via the degrees of freedom. I understand from your answer that this is expected: the difference would then reflect a different level of certainty about the same variance, because there are more points with which to estimate the variance for one feature than for the other.
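For instance (the df values here are purely illustrative): with an identical effect estimate and identical residual variance, the ordinary t-statistic is the same for both features, but the p-value still differs through the df:

```r
t_stat <- 3                                  # same t-statistic for both features
p_low_df  <- 2 * pt(-abs(t_stat), df = 6)    # feature with more missingness
p_high_df <- 2 * pt(-abs(t_stat), df = 10)   # feature with less missingness
p_low_df > p_high_df                         # TRUE: same t is less significant with fewer df
```

(With limma's moderated t, the statistic itself would also change, since the empirical Bayes shrinkage of the variance depends on the residual df, but the direction of the effect is the same.)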
For my case, I think I will restrict the analysis to your Case 2 to get an "intuitive" ranking of features for the contrast I care about, and to ignore some of the heteroskedasticity in the other treatments.
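Concretely, what I mean by restricting to Case 2 is subsetting the data to only the treatments that appear in the contrasts before fitting, so every feature's df comes from those groups alone. A base-R sketch of the effect on a single complete feature (variable names are illustrative):

```r
treatment <- factor(rep(paste0("treatment", 1:5), each = 3))
y1 <- rnorm(15)  # a feature with no missing values anywhere

# Keep only the treatments used in the contrasts
keep <- treatment %in% paste0("treatment", 1:3)
treatment_sub <- droplevels(treatment[keep])

df_full <- lm(y1 ~ 0 + treatment)$df.residual            # 15 - 5 = 10
df_sub  <- lm(y1[keep] ~ 0 + treatment_sub)$df.residual  #  9 - 3 = 6
```

After subsetting, a complete feature and a feature that was missing only in the excluded treatments get the same residual df, so the ranking for the contrast of interest no longer depends on missingness elsewhere. In the limma fit above this would correspond to subsetting the columns of counts and the rows of metadata before calling lmFit().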
Edit: I answered too soon! Thanks Gordon for your answer as well and for pointing me towards limpa. What you write makes sense and I'll check it out.
Chopping up your data into lots of two-group comparisons is almost never the right way to go. Almost always, it is better to analyse all the data at once. You haven't shown any evidence of heteroskedasticity between treatment groups for your real data but, if it does exist, then just tell limma to downweight groups according to their variability. In limpa, you do this by setting var.group = treatment when you run dpcDE().