Dear all, I am a rookie in data analysis and am stuck with my results; I don't know how to interpret them.
I started with 7 metagenomic assemblies from different species of the Azolla fern. The aim was to identify the bacteria in the leaf ecosystem of the different Azolla species. Our hypothesis was that if similar bacteria recur across the different Azolla species, their genomes will cluster together when plotted in a dendrogram or tree.
The methods used:
SPAdes to get the assemblies,
BWA for back-mapping,
SAMtools for sorting,
MetaBAT for binning, and CheckM to assess the completeness and contamination of the bins.
Prokka was used to annotate the genomes, and the UniProt IDs were extracted from the annotations. A table of all UniProt IDs across all bins was made, converted to a binary (presence/absence) table, and then used to create a dendrogram in R.
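To make the last step concrete, here is a minimal sketch of going from per-bin UniProt IDs to a binary table and then to a clustering. It is written in Python only for illustration (the real analysis was done in R), the bin names and IDs are made up, and the choice of Jaccard distance with average linkage is an assumption, not necessarily what was used for the actual dendrogram.

```python
from itertools import combinations

# Hypothetical bins and placeholder IDs, invented purely for illustration.
bin_annotations = {
    "hostA_rhizobiales": {"U0001", "U0002", "U0003", "U0004"},
    "hostA_burkholderiales": {"U0001", "U0002", "U0003", "U0005"},
    "hostB_rhizobiales": {"U0001", "U0004", "U0006", "U0007"},
}


def binary_table(annotations):
    """Return (sorted IDs, {bin: [0/1 per ID]}) as a presence/absence table."""
    all_ids = sorted(set().union(*annotations.values()))
    table = {
        name: [1 if uid in ids else 0 for uid in all_ids]
        for name, ids in annotations.items()
    }
    return all_ids, table


def jaccard_distance(row_a, row_b):
    """1 - |shared IDs| / |IDs present in either bin| for two 0/1 rows."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return 1.0 - both / either if either else 0.0


def average_linkage(table):
    """Naive average-linkage clustering.

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of bin names.
    """
    clusters = [(name,) for name in table]
    merges = []

    def dist(c1, c2):
        # Mean pairwise Jaccard distance between the members of two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(jaccard_distance(table[a], table[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((c1, c2, dist(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


all_ids, table = binary_table(bin_annotations)
merges = average_linkage(table)
for left, right, d in merges:
    print(left, "+", right, f"distance={d:.3f}")
```

In this toy input the two bins from host A share more IDs with each other than either shares with the host B bin, so they merge first. That is the same kind of pattern I see in my real tree.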
A tree was then made from the dendrogram in FigTree. In the tree I observed that the bacteria cluster according to the metagenomic sample (host plant) rather than according to their taxonomy. For example, Rhizobiales clusters with Burkholderiales from the same metagenomic assembly, but not with Rhizobiales from the assembly of another host plant.
I'm at a dead end on how to interpret these results and what I can deduce from them. Are there ways to improve my approach? Can I directly compare bins with the same taxonomy from different metagenomic assemblies? Any suggestions will be valuable.
manpy, student, Utrecht University, Holland