Hi everyone,
As part of my research project I am analysing faecal gut microbiome samples for the abundance of Lactobacillaceae species. For this purpose I ran Kraken2 on the samples and subsequently Bracken (at species level) to estimate the relative abundances.
When parsing the files I realised, however, that compared to the Kraken2 report, the Bracken report contained far fewer species and left out strains altogether. When I re-ran Bracken at strain level, many species were missing from the report, and to confirm this I also ran Bracken at genus level and got different results again.
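To make the comparison concrete, here is a minimal Python sketch of how the Bracken outputs for the three levels could be compared side by side. It assumes the standard tab-separated Bracken output table with `name` and `new_est_reads` columns; the function names `read_bracken` and `summarise_levels` are made up for illustration and are not part of Bracken.

```python
import csv


def read_bracken(handle):
    """Return {taxon name: estimated reads} from an open Bracken output table.

    Assumes the tab-separated Bracken output format with a header row
    containing the columns 'name' and 'new_est_reads'.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["name"]: int(row["new_est_reads"]) for row in reader}


def summarise_levels(abundances_by_level):
    """Count taxa and sum estimated reads for each taxonomic level.

    abundances_by_level maps a level label (e.g. 'G', 'S', 'S1') to the
    dictionary returned by read_bracken for that level.
    """
    summary = {}
    for level, abundances in abundances_by_level.items():
        summary[level] = {
            "n_taxa": len(abundances),
            "total_reads": sum(abundances.values()),
        }
    return summary
```

Each file would be opened with `open(path, newline="")` and passed to `read_bracken`. Comparing `n_taxa` and `total_reads` across levels shows how many taxa and how many reads each report actually covers.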
Furthermore, when I fitted linear mixed models to all three kinds of report files, I found very different results: for each individual level (genus, species, strain), associations were strongest for taxa at the level the report was generated for.
My question now is: do I have to use the strain-level Bracken files when I want to associate strains, the species-level Bracken files for species associations, and the genus-level files for genus associations,
OR do I use the lowest available level (e.g. strain) and derive genus and species associations from the strain-level abundances?
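To illustrate what the second option would mean in practice, here is a minimal Python sketch that sums lower-level abundances up to genus. It is only a heuristic: it assumes that the first word of each taxon name is the genus, which will not hold for names such as "Candidatus ..." or bracketed genera, and the function name `rollup_to_genus` is hypothetical. The resulting sums need not equal what Bracken estimates when it is run at genus level directly.

```python
from collections import defaultdict


def rollup_to_genus(reads_by_taxon):
    """Sum estimated reads per genus from species- or strain-level counts.

    Heuristic assumption: the genus is the first whitespace-separated
    word of the taxon name (e.g. 'Lactobacillus gasseri' -> 'Lactobacillus').
    """
    genus_reads = defaultdict(int)
    for name, reads in reads_by_taxon.items():
        genus = name.split()[0]
        genus_reads[genus] += reads
    return dict(genus_reads)
```

A proper roll-up would use the parent-child relationships of the taxonomy rather than the names.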
Thank you for your help.
Best, Rapha