Hi everyone, I've started learning about shotgun metagenomic sequencing. My understanding is that there are two main approaches: mapping-based and assembly-based. In the former, you map reads directly to a reference database to classify them taxonomically and perform some type of functional profiling. In the latter, you first assemble the reads into genomes and then do the classification and functional profiling.
What's unclear to me is this: in the assembly approach, you might (given enough reads) be able to assemble a full genome of a hitherto unidentified species. But since this new species is not in the database, you wouldn't know what it is anyway. You could compare its sequence against the database to see what it's similar to, but you can do that with the unassembled reads as well.
So, given the large computational resources required to assemble genomes, what additional value is gained? Having an assembled genome is a 'nice to have', but if the goal is simply classification and functional profiling, wouldn't a mapping-based approach be sufficient?
If the assembly is used for some sort of comparative genomics study, then that would be understandable. But how many shotgun metagenomics studies actually have a substantial comparative genomics component? Not that many, from my survey of the literature. Even when there is one, the assembled genomes probably make up only a small fraction of all the genomes in the sample, so they are hardly representative.
I'm a newbie to this field, so please tell me if my reasoning is wrong!