Hello everyone,
I have a set of metagenome-assembled genomes (MAGs, though I call them bins) with varying completeness levels. The size and number of contigs in each bin differ significantly. For example, one bin has 100% completeness but contains only 23 contigs, while another bin has around 50% completeness but includes about 650 contigs.
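To make the comparison between bins concrete, contiguity could be summarized per bin as contig count, total length, longest contig, and N50. This is only a minimal sketch, assuming each bin is a plain FASTA file; the function names (`contig_lengths`, `n50`, `bin_summary`) and the file name in the usage comment are made up for illustration and do not come from any binning tool. It measures fragmentation only and says nothing about completeness or contamination.

```python
from typing import Dict, Iterable, List


def contig_lengths(fasta_lines: Iterable[str]) -> List[int]:
    """Return the length of every sequence found in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # stays None until the first header line is seen
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that contigs of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def bin_summary(fasta_lines: Iterable[str]) -> Dict[str, int]:
    """Collect simple contiguity statistics for one bin."""
    lengths = contig_lengths(fasta_lines)
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "longest_bp": max(lengths, default=0),
        "n50_bp": n50(lengths),
    }


# Hypothetical usage with a bin stored as "bin_01.fa":
# with open("bin_01.fa") as handle:
#     print(bin_summary(handle))
```

Two bins with the same total size but very different contig counts would show up here as very different N50 values.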
Is it correct to understand that these MAGs are essentially collections of contigs, some of which may represent unknown genes? To assess how complete these MAGs are (most of our bins were classified only to the genus level), should I calculate their average nucleotide identity (ANI) against reference genomes of the corresponding genus? But how can we be sure that the downloaded genomes are themselves complete? In many papers describing novel MAGs, if you download the FASTA files, they are just sets of contigs, which look like bins to me. How can one be confident that such MAGs represent near-complete or high-quality genomes, given that they are fragmented into multiple contigs?
I would appreciate any insights or references on best practices for evaluating MAG completeness and quality beyond just completeness scores.
Thank you!