What I am suggesting is not necessarily a complete pipeline, but it should get the job done.
It starts with metagenome binning, for which you can use MetaBAT, CONCOCT, VizBin, or one of the more recent packages such as Vamb. The output may look something like this:
Next, you assess the quality of those bins using CheckM, whose output may look like this:
Bin Id     Marker lineage         # genomes  # markers  # marker sets  0    1    2    3    4    5+   Completeness  Contamination  Strain heterogeneity
------------------------------------------------------------------------------------------------------------------------------------------------------
group_001  k__Bacteria (UID203)   5449       104        58             6    81   17   0    0    0    94.36         5.80           5.88
group_003  k__Bacteria (UID203)   5449       104        58             4    12   46   42   0    0    93.97         7.19           19.77
group_015  k__Bacteria (UID2495)  2993       147        91             6    133  8    0    0    0    93.96         6.59           0.00
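If you end up with many bins, you will probably want to filter them by quality before moving on. Here is a minimal Python sketch that reads rows in the layout shown above and keeps the bins that meet a completeness and contamination cut-off. The function names and the threshold values are my own placeholders, not part of CheckM, so pick cut-offs that suit your study.

```python
# Sketch only: filter CheckM table rows by quality.
# parse_checkm_row / passing_bins and the thresholds are hypothetical helpers,
# not part of CheckM itself.

def parse_checkm_row(line):
    """Return (bin_id, completeness, contamination, heterogeneity) for one row."""
    fields = line.split()
    bin_id = fields[0]
    # The last three columns are the three percentages; reading from the right
    # sidesteps the space inside the marker lineage, e.g. "k__Bacteria (UID203)".
    completeness, contamination, heterogeneity = (float(x) for x in fields[-3:])
    return bin_id, completeness, contamination, heterogeneity


def passing_bins(lines, min_completeness=90.0, max_contamination=10.0):
    """Return the ids of bins that meet the (assumed) thresholds."""
    kept = []
    for line in lines:
        line = line.strip()
        # Skip blank lines, the header row, and the dashed separator.
        if not line or line.startswith("Bin Id") or set(line) == {"-"}:
            continue
        bin_id, completeness, contamination, _ = parse_checkm_row(line)
        if completeness >= min_completeness and contamination <= max_contamination:
            kept.append(bin_id)
    return kept


rows = [
    "group_001 k__Bacteria (UID203) 5449 104 58 6 81 17 0 0 0 94.36 5.80 5.88",
    "group_003 k__Bacteria (UID203) 5449 104 58 4 12 46 42 0 0 93.97 7.19 19.77",
    "group_015 k__Bacteria (UID2495) 2993 147 91 6 133 8 0 0 0 93.96 6.59 0.00",
]

# A stricter contamination cut-off of 6% keeps only the first bin here.
print(passing_bins(rows, min_completeness=90.0, max_contamination=6.0))
```

With the looser default of 10% contamination, all three example bins pass.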
The next step is taxonomic assignment, which can be done using the GTDB Toolkit (GTDB-Tk); its output may look like this:
user_genome  classification
group_001    d__Bacteria;p__Acidobacteriota;c__Aminicenantia;o__UBA2199;f__UBA2199;g__UBA2199;s__
group_003    d__Bacteria;p__Acidobacteriota;c__Aminicenantia;o__Aminicenantales;f__RBG-16-66-30;g__;s__
group_015    d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Steroidobacterales;f__Steroidobacteraceae;g__RPQJ01;s__
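The classification strings are easy to work with once you split them into ranks. An empty suffix such as `g__` or `s__` means no name was assigned at that rank. Below is a small Python sketch that reports the lowest rank each bin was assigned to. The helper names are hypothetical (they are not part of GTDB-Tk), and it assumes the seven-rank `d__...;s__` format shown above.

```python
# Sketch only: split a GTDB-style classification string into named ranks.
# parse_classification / lowest_assigned_rank are hypothetical helpers.

RANKS = {
    "d": "domain", "p": "phylum", "c": "class", "o": "order",
    "f": "family", "g": "genus", "s": "species",
}


def parse_classification(classification):
    """Map rank name -> taxon name ('' when the rank is unassigned)."""
    ranks = {}
    for part in classification.split(";"):
        prefix, _, name = part.partition("__")
        ranks[RANKS[prefix]] = name
    return ranks


def lowest_assigned_rank(classification):
    """Return (rank, name) for the most specific rank that has a name."""
    lowest = None
    # Dicts keep insertion order, so this walks from domain down to species.
    for rank, name in parse_classification(classification).items():
        if name:
            lowest = (rank, name)
    return lowest


example = (
    "d__Bacteria;p__Acidobacteriota;c__Aminicenantia;"
    "o__Aminicenantales;f__RBG-16-66-30;g__;s__"
)
print(lowest_assigned_rank(example))
```

For group_003 this stops at the family level, since both the genus and species fields are empty.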
Or if you just want to visualize the differences, you can bin the two groups together in the first step and separate them into two plots:
Thank you very much - much appreciated! I came across MetaWRAP, which seems to do everything needed in (more or less) one go: https://github.com/bxlab/metaWRAP
The t-SNE/PCA is a good idea though, definitely worth doing even if only for a pretty picture.