If you run the traditional 515/806 microbiome primer pair through Primer-BLAST against NCBI Reference Genomes, you will find hundreds of bacterial taxa that yield multiple paralogs of the 16S gene. For example:

NZ_CP089309.1:3406801-3407055 Thiocapsa bogorovii strain BBS chromosome, complete genome
NZ_CP089309.1:3716278-3716532 Thiocapsa bogorovii strain BBS chromosome, complete genome

Just looking quickly, I found one species with 11 paralogs of 16S:

NZ_CP089997.1 Cytobacillus spongiae strain CY-G chromosome, complete genome
I'm confident that a thorough search would yield taxa with dozens of paralogs. Assuming that NCBI Reference Genomes are annotated correctly, this represents an obvious bias in PCR detection probability: a taxon with more 16S copies per genome contributes more amplicons per cell, which would inflate its apparent relative abundance in microbiome analysis. How does QIIME deal with this bias? Please tell me that I've overlooked something silly, because this seems like a big deal.
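To put rough numbers on the concern, here is a minimal sketch of the arithmetic. Everything in it is an assumption for illustration: the taxon names and cell counts are made up, the copy number of 11 only echoes the count I mentioned above, and the model assumes every 16S copy amplifies equally well. The correction step (divide reads by copy number, then renormalize) is just the obvious back-of-the-envelope fix, not a description of what QIIME does.

```python
def amplicon_fractions(cell_counts, copy_numbers):
    """Expected amplicon fraction per taxon if every 16S copy amplifies equally."""
    copies = {t: cell_counts[t] * copy_numbers[t] for t in cell_counts}
    total = sum(copies.values())
    return {t: c / total for t, c in copies.items()}


def copy_number_corrected(read_counts, copy_numbers):
    """Divide each taxon's reads by its 16S copy number, then renormalize."""
    weighted = {t: read_counts[t] / copy_numbers[t] for t in read_counts}
    total = sum(weighted.values())
    return {t: w / total for t, w in weighted.items()}


# Hypothetical community: equal numbers of cells of two taxa.
cell_counts = {"taxon_A": 100, "taxon_B": 100}
# Hypothetical 16S copies per genome (illustrative values only).
copy_numbers = {"taxon_A": 1, "taxon_B": 11}

observed = amplicon_fractions(cell_counts, copy_numbers)

# Pretend the sequencer returns reads in proportion to amplicons.
read_counts = {t: cell_counts[t] * copy_numbers[t] for t in cell_counts}
corrected = copy_number_corrected(read_counts, copy_numbers)

for taxon in cell_counts:
    print(f"{taxon}: observed {observed[taxon]:.3f}, corrected {corrected[taxon]:.3f}")
```

With equal cell counts, the 11-copy taxon accounts for 11/12 of the amplicons in this toy model, and dividing by copy number brings both taxa back to one half. In real data the copy number is unknown for most taxa, so the correction is only as good as whatever estimate you plug in.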
So, my observation is not new: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002743

Geez. How are we this far into microbiome work without a real solution to this problem?