Hi,
I have the following dataset (toy example):
sample     timepoint     ASV1   ASV2   ASV3   ...   cluster
sample 1   timepoint1     435    439     39   ...   cluster1
sample 1   timepoint7     845     32      0   ...   cluster2
sample 2   timepoint1      90     13      4   ...   cluster2
sample 2   timepoint7      45      0     48   ...   cluster2
sample 2   timepoint13     90      3     23   ...   cluster3
For 30 individuals (samples) I have genus-level abundances (ASV counts, 216 in total), and I want to test for differential abundance between clusters, which I defined myself based on a PCoA visualization. I am running the code below, with ps_final as my phyloseq object, to see whether taxa differ in abundance between the different clusters.

My first question: how do I block on subject (sample_id) here? If I include sample_id in the design, I get:

Error in checkFullRank(modelMatrix) : the model matrix is not full rank, so the model cannot be fit as specified. One or more variables or interaction terms in the design formula are linear combinations of the others and must be removed.

My second question: timepoint 13 is available for only 7 individuals and not for the others. How do I deal with this?
Any help is greatly appreciated!
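library(phyloseq)
library(DESeq2)
# Design with cluster plus subject as a blocking factor; this design raises the full-rank error
g_dds <- phyloseq_to_deseq2(ps_final, ~ cluster + sample_id)
g_dds <- DESeq(g_dds, test = "Wald", fitType = "parametric")
taxes <- as.data.frame(tax_table(ps_final))
# The contrast level names must match the values of sample_data(ps_final)$cluster exactly (case-sensitive)
res <- results(g_dds, contrast = c("cluster", "Cluster1", "Cluster2"), cooksCutoff = FALSE)
alpha <- 0.01
# Keep taxa with an adjusted p-value below alpha and append their taxonomy
sigtab <- res[which(res$padj < alpha), ]
sigtab <- cbind(as(sigtab, "data.frame"), as(tax_table(ps_final)[rownames(sigtab), ], "matrix"))
head(sigtab)
# Label the rows by genus
taxes_to_sigtab <- taxes[match(rownames(sigtab), rownames(taxes)), ]$Genus
rownames(sigtab) <- taxes_to_sigtab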
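
To show what I think is going on with the full-rank error, here is a minimal sketch in base R on made-up metadata. The `meta` data frame, its subject labels and its cluster assignments are hypothetical and not my real design. I am only assuming that the real sample data has columns named `cluster` and `sample_id`, as in the formula above. The sketch builds the design matrix for `~ cluster + sample_id` and compares its rank with its number of columns.

```r
# Hypothetical toy metadata: subject s3 only ever appears in cluster3,
# and cluster3 contains no other subject.
meta <- data.frame(
  sample_id = factor(c("s1", "s1", "s2", "s2", "s3", "s3")),
  cluster   = factor(c("cluster1", "cluster2", "cluster2",
                       "cluster2", "cluster3", "cluster3"))
)

# Design matrix for the same formula that is passed to phyloseq_to_deseq2()
mm <- model.matrix(~ cluster + sample_id, data = meta)

# The design is full rank only if the QR rank equals the number of columns
rank_mm <- qr(mm)$rank
is_full_rank <- rank_mm == ncol(mm)
print(c(columns = ncol(mm), rank = rank_mm))
print(is_full_rank)  # FALSE for this toy layout

# Cross-tabulation of subject by cluster, to see which subjects never
# leave a single cluster
tab <- table(meta$sample_id, meta$cluster)
print(tab)
```

In this toy layout the column for cluster3 is identical to the column for subject s3, so the two effects cannot be separated. On my real data the same check could presumably be run on `data.frame(sample_data(ps_final))`, but I have not confirmed that this is the only cause of the error there.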