Query coverage in metagenomic gene family annotation
7.9 years ago
Moose • 0

Dear all,

I am working with metagenomic gene abundance tables derived from the MetaHIT 3.9 million catalog and want to work with genes annotated to KOs in KEGG. The data are provided here and the KEGG annotation is provided as the file "KEGG annotation of the gene catalog."

The annotation file provides gene name, KEGG hit, bit score, fraction of KEGG hit covered, and then the KOs/Pathways/Modules annotated to that gene.

I cannot find the specifics of the annotation in a publication, but it appears that the authors used BLASTP and considered a gene annotated to a KO if its hit had a bit score greater than 60 (the minimum in the file is 60.1).

However, the fraction of the KEGG hit covered ranges from near 0 to 1, i.e. from roughly 0% to 100%.

Is it common practice to filter further on the fraction of the hit covered? Is there a precedent or biological rationale for choosing a specific threshold, e.g. 30%, 50% or 90%?
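
To make the question concrete, here is a minimal sketch of the kind of filter I have in mind. It assumes the annotation file is tab-separated with the columns in the order listed above; the column names, the helper names `filter_by_coverage` and `count_retained`, and the example rows are all made up for illustration and are not taken from the real file.

```python
import csv
import io

# Hypothetical column names; I am assuming the order described above.
COLUMNS = ["gene", "kegg_hit", "bit_score", "hit_coverage", "annotation"]


def filter_by_coverage(handle, min_coverage, min_bit_score=60.0):
    """Yield rows that pass both the bit score and the hit coverage cutoffs."""
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        score_ok = float(row["bit_score"]) > min_bit_score
        coverage_ok = float(row["hit_coverage"]) >= min_coverage
        if score_ok and coverage_ok:
            yield row


# Made-up rows for illustration only (not real catalog entries).
EXAMPLE = (
    "gene_1\thit_a\t60.1\t0.05\tK00001\n"
    "gene_2\thit_b\t250.0\t0.55\tK00002\n"
    "gene_3\thit_c\t900.0\t0.98\tK00003\n"
)


def count_retained(text, thresholds=(0.3, 0.5, 0.9)):
    """Count how many annotated genes survive each candidate coverage cutoff."""
    return {
        t: sum(1 for _ in filter_by_coverage(io.StringIO(text), t))
        for t in thresholds
    }


print(count_retained(EXAMPLE))
```

Running something like this on the real file would show how many annotated genes each candidate threshold discards, but it does not tell me which threshold is defensible, which is what I am asking about.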

Many thanks in advance.

metagenomics kegg threshold • 1.3k views