How to decide what BLAST settings to use when searching for functional genes in a metagenome
3.9 years ago
crystal4 • 0

I have several lists of ORFs from metagenomic samples. I'm looking for specific genes by BLASTing the ORFs against databases of genes with known functions (for example, a database of nirK genes). I am having trouble figuring out what values I should use for BLAST parameters such as identity, coverage, and word size. I know there probably isn't an exact answer, but are there any guidelines or papers dealing with this topic?

metagenomics blast sequencing microbial • 669 views
3.9 years ago
Mensur Dlakic ★ 27k

It depends on what you are trying to achieve. Generally speaking, E-value will be more informative than the parameters you listed. If you are looking for orthologs, E <= 1e-20 is probably a good starting point, along with high coverage. If you want to identify paralogs, or more distantly related proteins in general, E <= 1e-5 should be a good cutoff. In the latter case you are not necessarily expecting either high identity or high coverage.
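
To make the two cutoffs concrete, here is a minimal sketch of how you might filter tabular BLAST output after the search. It is only an illustration under some assumptions: that the search was run with -outfmt "6 qseqid sseqid pident length qlen slen evalue bitscore", that query coverage is approximated as alignment length divided by query length, and that "high coverage" means at least 0.8. That 0.8 value, the function name filter_hits, and the example hits are all made up for this sketch; pick thresholds that suit your own data.

```python
import csv
import io

# Assumed column order, matching a BLAST+ run with
# -outfmt "6 qseqid sseqid pident length qlen slen evalue bitscore"
COLUMNS = ["qseqid", "sseqid", "pident", "length",
           "qlen", "slen", "evalue", "bitscore"]


def filter_hits(handle, max_evalue=1e-20, min_query_cov=0.0):
    """Yield (query, subject, evalue, coverage) for hits passing both cutoffs.

    Coverage here is alignment length / query length. This is an
    approximation: the alignment length includes gap columns, so the
    ratio can slightly overestimate the true query coverage.
    """
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        evalue = float(row["evalue"])
        query_cov = int(row["length"]) / int(row["qlen"])
        if evalue <= max_evalue and query_cov >= min_query_cov:
            yield row["qseqid"], row["sseqid"], evalue, round(query_cov, 2)


# Hypothetical hits, invented purely to exercise the filter.
EXAMPLE = "\n".join("\t".join(fields) for fields in [
    ["orf1", "nirK_A", "85.0", "300", "320", "330", "1e-80", "400"],
    ["orf2", "nirK_B", "35.0", "120", "400", "330", "1e-6", "55"],
    ["orf3", "nirK_A", "28.0", "60", "350", "330", "0.5", "30"],
])

# Ortholog-style search: strict E-value plus high coverage (0.8 is an assumption).
strict = list(filter_hits(io.StringIO(EXAMPLE), max_evalue=1e-20, min_query_cov=0.8))

# Distant-homolog search: relaxed E-value, no coverage requirement.
relaxed = list(filter_hits(io.StringIO(EXAMPLE), max_evalue=1e-5))

print("strict:", strict)
print("relaxed:", relaxed)
```

With these made-up hits, only orf1 passes the strict filter, while the relaxed filter also keeps orf2 despite its low identity and coverage. For real data you would pass an open file handle instead of the io.StringIO wrapper.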
