Hi, I'm new to the field of environmental metagenomics and I want to do some functional annotation of my shotgun metagenome data. From the literature, I came across two types of pipeline for metagenomic functional annotation: read-based vs. contig-based. But I can't decide which one to use, since both have pros and cons. In a read-based pipeline, BLAST is run directly on the clean reads, which probably means more of the data gets used. But the reads are short, mostly 150 bp, which is shorter than many of the target genes. Will that affect the accuracy or efficiency of the BLAST search?
For a contig-based pipeline, there might be a big loss after assembly, because many sequences from natural samples cannot be assembled. If I use it, how should I evaluate the assembly result? With something like CheckM? What evaluation threshold would be reasonable?
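To make the assembly question more concrete: as far as I understand, CheckM scores the completeness and contamination of genome bins rather than the assembly as a whole, so a whole-assembly check would look more like the sketch below. It computes N50 (a contiguity measure) and the fraction of reads that map back to the contigs (a rough measure of how much data the assembly lost). The function names and the example numbers are made up for illustration, and the mapped-read counts are assumed to come from a separate read-mapping step that is not shown.

```python
def n50(contig_lengths):
    """Return N50: the length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


def mapped_fraction(mapped_reads, total_reads):
    """Fraction of clean reads that map back to the contigs.
    Both counts are assumed to come from a separate mapping step."""
    return mapped_reads / total_reads if total_reads else 0.0


if __name__ == "__main__":
    # Hypothetical contig lengths (bp) and read counts, for illustration only.
    example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
    print("N50:", n50(example_lengths))
    print("Mapped fraction:", mapped_fraction(750, 1000))
```

A low mapped fraction would suggest that much of the community is missing from the contigs, which is the loss I am worried about.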
I would really appreciate any responses. Sorry if this is a naive question. Thank you!
Thank you so much for your kindness and time! That's really helpful! I will try the contig-based approach first and look for information about single-copy orthologs for the MAGs, which is new to me. :) Best wishes!