Hello all!
I have a shotgun metagenomic dataset from which I recovered several bins with varying completeness and contamination. I now want to calculate Average Nucleotide Identity (ANI) to check whether our MAGs represent new species. I am planning to use FastANI, so I need reference genomes to compare the MAGs against.
Most of our bins are classified (GTDB) only to genus or family level, with just a few assigned to a species.
Here is my question. For example, one bin was assigned to d__Bacteria;p__Verrucomicrobiota;c__Verrucomicrobiia;o__Chthoniobacterales;f__UBA10450;g__AV40;s__ ... with completeness 100%, and it consists of only 23 contigs.
What I think I need to do is download all the genomes from the genus g__AV40 that are available on the GTDB website. However, they all have different completeness, and they are not reference data: those bacterial genomes come from MAGs rather than from isolates. There are also no complete genomes available on NCBI if I search with the original name of the genus.
So the issue is: I can get those genomes (from GTDB), but they are not really reference genomes.
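To make the plan concrete, here is a minimal sketch of how I would read the FastANI results once I have them. It assumes the standard tab-separated FastANI output (query, reference, ANI, matched fragments, total query fragments) and a 95% ANI cutoff as the species boundary, which is a commonly used convention and not something I have validated for this genus. The file names and numbers in the example are made up for illustration only.

```python
import csv
import io

# Assumption: 95% ANI is used as the species-level cutoff.
SPECIES_ANI = 95.0


def best_hits(fastani_tsv):
    """Keep the highest-ANI reference for each query genome.

    Returns {query: (reference, ani, aligned_fraction)}, where
    aligned_fraction = matched fragments / total query fragments.
    """
    best = {}
    for row in csv.reader(fastani_tsv, delimiter="\t"):
        if len(row) < 5:
            continue  # skip blank or malformed lines
        query, ref = row[0], row[1]
        ani = float(row[2])
        aligned_fraction = int(row[3]) / int(row[4])
        if query not in best or ani > best[query][1]:
            best[query] = (ref, ani, aligned_fraction)
    return best


def novel_candidates(best, queries, cutoff=SPECIES_ANI):
    """Queries whose best hit is below the cutoff, or that have no hit at all.

    FastANI does not report pairs whose ANI is far below about 80%,
    so a missing query is treated as "no close reference found".
    """
    return [q for q in queries if q not in best or best[q][1] < cutoff]


# Hypothetical FastANI output, for illustration only.
example = io.StringIO(
    "bin_07.fa\tAV40_ref_A.fa\t83.2\t512\t1400\n"
    "bin_07.fa\tAV40_ref_B.fa\t88.9\t730\t1400\n"
    "bin_12.fa\tother_ref.fa\t97.4\t1100\t1250\n"
)
hits = best_hits(example)
print(novel_candidates(hits, ["bin_07.fa", "bin_12.fa", "bin_20.fa"]))
```

The aligned fraction is kept next to the ANI value because the references here are MAGs of varying completeness, so a high ANI over a small part of the genome would deserve a closer look.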
What should I do in this situation?
Thanks, Best, Alla
Thank you. I was just wondering if there are any methods that can be applied. I do understand that metagenomics has many challenges and unresolved issues.