I am having a debate with my PI about how to interpret the information in an OTU table. The question is: does the OTU table generated after aligning sequences against a database provide accurate information for estimating the number of bacteria present in the sample? For example, if OTU10 has a count of 2 and OTU1 has a count of 1000, does that mean only 2 bacterial cells of OTU10 are present in the environment, or does it just mean that OTU10 is much less abundant than OTU1? Or would qPCR be necessary to get an accurate estimate of quantity?
For example, if OTU10 has a count of 2 and OTU1 has a count of 1000, does that mean only 2 bacterial cells of OTU10 are present in the environment
It certainly does not mean that there are 2 bacterial cells belonging to OTU10 in the environment/sample. There is no simple relationship between the number of reads (counts) and the absolute number of bacteria in the sample. You should think of your sequenced reads as a random sample from the population of bacteria. If you have 10,000 reads for a given sample, it is similar to surveying 10,000 randomly selected Canadians, say. The survey can tell you about proportions within the population, but it will never allow you to estimate the population of Canada.
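To make the survey analogy concrete, here is a minimal simulation sketch. The two communities, their cell counts, and the `sequence` helper are all made up for illustration, and it assumes reads are drawn purely at random in proportion to cell counts, which ignores the biases real data has. The point is only that two samples with the same proportions but a 1000-fold difference in total cells give essentially the same read counts at the same sequencing depth.

```python
import random
from collections import Counter


def sequence(community, depth, seed=0):
    """Draw `depth` reads; each taxon is picked in proportion to its cell count."""
    rng = random.Random(seed)
    taxa = list(community)
    weights = [community[t] for t in taxa]
    return Counter(rng.choices(taxa, weights=weights, k=depth))


# Hypothetical communities: same proportions, 1000x different absolute numbers.
small = {"OTU1": 1_000, "OTU10": 2}
large = {"OTU1": 1_000_000, "OTU10": 2_000}

counts_small = sequence(small, depth=10_000, seed=1)
counts_large = sequence(large, depth=10_000, seed=1)

print("small community:", dict(counts_small))
print("large community:", dict(counts_large))
```

In both cases the counts add up to the sequencing depth (10,000 here), not to anything related to the number of cells in the sample.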
or does it just mean that OTU10 is much less abundant than OTU1?
This is much closer to the truth, but with some caveats. Different types of bacteria may be unequally represented in your 16S data, even if they are present in equal proportions in the original sample. Two mechanisms for this are:
- Some bacteria have more copies of the 16S gene in their genomes than others, and will therefore contribute more 16S reads on average.
- Some bacteria may not have exact matches to the primers used in the PCR, and will therefore contribute fewer sequenced reads.
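As a rough sketch of how these two mechanisms distort proportions, assume the expected share of reads for each taxon is proportional to (fraction of cells) × (16S copies per genome) × (primer efficiency). That multiplicative model, the function name, and every number below are assumptions for illustration, not measured values.

```python
def expected_read_fractions(cell_fraction, copies_16s, primer_efficiency):
    """Expected read share per taxon under a simple multiplicative bias model."""
    weights = {
        taxon: cell_fraction[taxon] * copies_16s[taxon] * primer_efficiency[taxon]
        for taxon in cell_fraction
    }
    total = sum(weights.values())
    return {taxon: w / total for taxon, w in weights.items()}


# Two taxa present in equal numbers of cells (hypothetical values throughout).
cells = {"A": 0.5, "B": 0.5}

# Mechanism 1: B carries 7 copies of the 16S gene, A carries 1.
by_copy_number = expected_read_fractions(
    cells, {"A": 1, "B": 7}, {"A": 1.0, "B": 1.0}
)

# Mechanism 2: same copy number, but B amplifies half as well due to a primer mismatch.
by_primer = expected_read_fractions(
    cells, {"A": 2, "B": 2}, {"A": 1.0, "B": 0.5}
)

print(by_copy_number)  # A gets 1/8 of the reads, B gets 7/8
print(by_primer)       # A gets 2/3 of the reads, B gets 1/3
```

In both cases the cells are split 50/50, yet the read proportions are far from equal.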
These biases are discussed in another good paper by Robert Edgar (I recommend paying attention to his work):
In addition, I think there are many other sources of bias that complicate the numerical relationship between your OTU counts (or rather their proportions, like 2/1000 in your example) and the true relative abundances of bacteria. In practice, though, you will usually be comparing two groups of samples, and the same biases will be present in both groups, so differences between the groups can still be informative.
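A small sketch of why a shared bias matters less when comparing groups. It assumes the bias is a fixed multiplicative factor per taxon that is identical in both groups, which is a simplification; the taxa, bias factors, and proportions are invented. Under that assumption the observed proportions are distorted, but the change in the ratio of two taxa between groups is preserved.

```python
def observe(true_fraction, bias):
    """Apply a per-taxon multiplicative bias and renormalise to proportions."""
    weights = {t: true_fraction[t] * bias[t] for t in true_fraction}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


def ratio(fractions, numerator, denominator):
    return fractions[numerator] / fractions[denominator]


# Hypothetical bias factors, assumed identical in both groups.
bias = {"OTU1": 1.0, "OTU10": 0.2, "OTU5": 3.0}

# Hypothetical true proportions: OTU10 doubles relative to OTU1 in group B.
group_a = {"OTU1": 0.70, "OTU10": 0.10, "OTU5": 0.20}
group_b = {"OTU1": 0.70, "OTU10": 0.20, "OTU5": 0.10}

obs_a = observe(group_a, bias)
obs_b = observe(group_b, bias)

true_change = ratio(group_b, "OTU10", "OTU1") / ratio(group_a, "OTU10", "OTU1")
observed_change = ratio(obs_b, "OTU10", "OTU1") / ratio(obs_a, "OTU10", "OTU1")

print("true OTU10 fraction in A:    ", group_a["OTU10"])
print("observed OTU10 fraction in A:", round(obs_a["OTU10"], 4))
print("true change in OTU10/OTU1:    ", round(true_change, 4))
print("observed change in OTU10/OTU1:", round(observed_change, 4))
```

The observed OTU10 fraction in group A is well below its true value, while the between-group change in the OTU10/OTU1 ratio comes out the same for the true and the observed data. If the bias differs between groups (different primers, extraction kits, or runs), this no longer holds.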
I think the problem lies in the understanding of the term "OTU". It is frequently misunderstood and treated as a synonym for "species" in a bacterial community, which is of course incorrect.
Keep in mind that these are rough estimates of the bacterial diversity (that's why we call it "abundance estimation"). Apart from that, I found the paper shared by genomax very interesting; maybe I'll go through it as well.