Hi,
I have the following dataset (toy example):
sample_id  timepoint    ASV1  ASV2  ASV3  ...  cluster
sample1    timepoint1   435   439   39    ...  cluster1
sample1    timepoint7   845   32    0     ...  cluster2
sample2    timepoint1   90    13    4     ...  cluster2
sample2    timepoint7   45    0     48    ...  cluster2
sample2    timepoint13  90    3     23    ...  cluster3
For 30 individuals (samples) I have genus abundances (ASV counts, 216 in total), and I want to do differential abundance testing across clusters that I defined myself from a PCoA visualization. I am using the code below, where ps_final is my phyloseq object, to test whether abundances differ between the clusters.
How do I block on subject (sample_id) here? Also, timepoint 13 is available for only 7 individuals and is missing for the others. How do I deal with this?
If I include sample_id in the design, I get:
Error in checkFullRank(modelMatrix) : the model matrix is not full rank, so the model cannot be fit as specified. One or more variables or interaction terms in the design formula are linear combinations of the others and must be removed.
Any help is greatly appreciated!
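# Build the DESeq2 object; this design triggers the full-rank error quoted above
g_dds <- phyloseq_to_deseq2(ps_final, ~ cluster + sample_id)
g_dds <- DESeq(g_dds, test = "Wald", fitType = "parametric")
# Taxonomy table as a data frame, used below to label results by genus
taxes <- as.data.frame(tax_table(ps_final))
# Wald test comparing Cluster1 against Cluster2
res <- results(g_dds, contrast = c("cluster", "Cluster1", "Cluster2"), cooksCutoff = FALSE)
alpha <- 0.01
# Keep taxa whose adjusted p-value is below alpha
sigtab <- res[which(res$padj < alpha), ]
# Append the taxonomy columns for the significant taxa
sigtab <- cbind(as(sigtab, "data.frame"), as(tax_table(ps_final)[rownames(sigtab), ], "matrix"))
head(sigtab)
# Relabel the rows by genus
taxes_to_sigtab <- taxes[match(rownames(sigtab), rownames(taxes)), ]$Genus
rownames(sigtab) <- taxes_to_sigtab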
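
To make the full-rank problem concrete, here is a minimal base R sketch with made-up metadata, not my real data. It shows one situation that produces this kind of rank deficiency: a subject (C) whose samples all sit in one cluster that no other subject visits, so the cluster column and the subject column of the design matrix are identical. The names meta, mm and rank_mm are hypothetical, and I do not know whether this is the cause in my own data.

```r
# Hypothetical metadata: subjects A and B switch between clusters,
# subject C stays in cluster3 and is the only subject in that cluster.
meta <- data.frame(
  sample_id = factor(c("A", "A", "B", "B", "C", "C")),
  cluster   = factor(c("cluster1", "cluster2", "cluster1",
                       "cluster2", "cluster3", "cluster3"))
)

# Same design as in the DESeq2 call above, built with base R only.
mm <- model.matrix(~ cluster + sample_id, data = meta)

# If the rank is smaller than the number of columns, the matrix is not full rank.
rank_mm <- qr(mm)$rank
cat("columns:", ncol(mm), " rank:", rank_mm, "\n")

# The cross-tabulation shows which subjects contribute samples to which clusters.
print(table(meta$sample_id, meta$cluster))
```

For the real data, I assume the same check could be run on the sample metadata of ps_final, for example after extracting it with as(sample_data(ps_final), "data.frame").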