I have the following dilemma: I work in a lab where we shotgun-sequence a large number of microbiome samples. New wet-lab practices are often tested or validated, and I am tasked with performing the statistical analysis to show whether the new wet-lab method gives a significantly different result.
So, basically, I have pairs of samples, one sequenced with the old method and one with the new one (both coming from the same biological source), and I want to compare them based on the bacterial relative abundances. For the sake of simplicity, let's say that the bioinformatic methods don't change, so we always classify the bacteria with the same pipeline. I feel a bit stuck on how to compare these sample pairs. I can calculate the various alpha- and beta-diversity metrics, but they won't give a p-value unless I can compare them to some other reference.
So far I have two ideas:
1. Take samples coming from the same biological source, sequenced using the same method, and calculate an average Bray-Curtis distance (or some other beta-diversity metric) between these samples (which should be a low value, since there shouldn't be any difference between them). Then, for any new pair of samples, I can calculate the Bray-Curtis distance between them and compare it to this average (e.g. using a simple t-test). If the result is non-significant, I can say that the new method didn't change the sample composition significantly.
2. Order the bacteria in the samples based on relative abundances and use a rank correlation method to compare the samples. I am thinking of Kendall's tau (or maybe Kendall's tau-b, because I expect a lot of zeros, and therefore ties).
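To make the two ideas concrete, here is a minimal sketch of what I have in mind, assuming Python with NumPy and SciPy. The abundance vectors are made up for illustration, and `baseline_distances` and `compare_pair` are hypothetical helper names, not part of any library. The one-sample t-test is just one way to read "compare it to this average"; I am aware that the pairwise baseline distances are not independent, so the p-value should be taken with caution.

```python
from itertools import combinations

import numpy as np
from scipy.spatial.distance import braycurtis
from scipy.stats import kendalltau, ttest_1samp


def baseline_distances(replicates):
    """Bray-Curtis distance for every pair of same-method replicates."""
    return np.array([braycurtis(a, b) for a, b in combinations(replicates, 2)])


def compare_pair(old, new, replicates):
    """Return the pair's Bray-Curtis distance, a t-test p-value against the
    baseline distances, and Kendall's tau-b with its p-value."""
    d_pair = braycurtis(old, new)
    # Idea 1: does the mean baseline distance differ from the pair's distance?
    _, p_bc = ttest_1samp(baseline_distances(replicates), popmean=d_pair)
    # Idea 2: rank correlation; SciPy's default variant is tau-b (handles ties).
    tau, p_tau = kendalltau(old, new)
    return d_pair, p_bc, tau, p_tau


# Made-up relative abundances over five taxa (each row sums to 1).
replicates = np.array([
    [0.50, 0.30, 0.15, 0.05, 0.00],
    [0.48, 0.32, 0.14, 0.06, 0.00],
    [0.52, 0.28, 0.16, 0.04, 0.00],
    [0.49, 0.31, 0.13, 0.07, 0.00],
])
old = replicates[0]  # old-method sample
new = np.array([0.30, 0.40, 0.10, 0.10, 0.10])  # hypothetical new-method sample

d_pair, p_bc, tau, p_tau = compare_pair(old, new, replicates)
print(f"Bray-Curtis = {d_pair:.3f} (p = {p_bc:.3g}); tau-b = {tau:.3f} (p = {p_tau:.3g})")
```

Note that idea 1 needs the same-method replicates to build the baseline, whereas idea 2 only needs the two samples of the pair.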
Do these ideas make sense? If not, what else can I do?
That's the problem: they (my supervisor) would like to avoid using replicates. They just want to compare individual samples, and I am having a hard time communicating that this is not something you do when working with microbiome data.
Their ideal scenario: take two stool samples from the same person at the same time. Sequence one using the old, trusted method and the other with some improvement. Analyze both samples with the same bioinformatics pipeline, do some statistical magic, and tell whether the two samples are significantly different or not.