Hello,
This may be a naive question, but I would really appreciate your help. I want to predict whether a type strain (for example, Methanofollis liminatans GKZPZ) has a particular enzyme (anaerobic carbon-monoxide dehydrogenase).
I have considered two approaches:
1. Take the enzyme's sequence from a type strain of a different but better-studied genus and BLAST it against my strain. That sequence is probably better characterized.
2. Collect all the sequences encoding this enzyme that I can find in the NCBI Gene/Protein databases and remove redundancy with CD-HIT to build a local custom database. Then BLAST them against my strain's genome.
Are these two methods feasible, and which one is better? Or is there another commonly used method?
Another question: is there any way to predict whether the type strain has a different enzyme with the same function as the one I am looking for?
I would really appreciate it if you could give me some advice. Thanks so much.
If you are looking for a specific enzyme and have its sequence from a related type strain, #1 above should work and would likely be the quickest method. Since protein searches are likely to be more sensitive, it may be best to use a protein query for the enzyme.
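As a rough illustration of the protein-query search, here is a minimal Python sketch. It assumes BLAST+ is installed and that you search with tblastn (protein query against the translated genome). The file names, the function names and the filtering thresholds are placeholders I made up for the example, not recommended values.

```python
import csv
import io

# Column layout requested through -outfmt below (a choice made for this sketch).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "evalue", "bitscore", "qcovs"]


def blast_commands(genome_fasta, query_faa, db_name="genome_db", out_tsv="hits.tsv"):
    """Build the makeblastdb and tblastn command lines (file names are placeholders)."""
    outfmt = "6 " + " ".join(COLUMNS)
    return [
        # Index the strain's genome as a nucleotide database.
        ["makeblastdb", "-in", genome_fasta, "-dbtype", "nucl", "-out", db_name],
        # Search the protein query against the six-frame translated genome.
        ["tblastn", "-query", query_faa, "-db", db_name,
         "-evalue", "1e-5", "-outfmt", outfmt, "-out", out_tsv],
    ]


def filter_hits(tsv_text, max_evalue=1e-10, min_identity=30.0, min_qcov=70.0):
    """Keep tabular hits passing illustrative e-value, identity and coverage cutoffs."""
    hits = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed lines
        hit = dict(zip(COLUMNS, row))
        if (float(hit["evalue"]) <= max_evalue
                and float(hit["pident"]) >= min_identity
                and float(hit["qcovs"]) >= min_qcov):
            hits.append(hit)
    return hits
```

Each command list can be passed to `subprocess.run(cmd, check=True)`, and the contents of the output file can then be given to `filter_hits`. A hit that passes the cutoffs is only a candidate; the thresholds should be tuned to how divergent the reference strain is.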
You can make a prediction based on sequence similarity, the presence of the right domain, etc., but conclusively proving that prediction will require experimental verification.
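One possible way to check for the right domain is to scan the strain's predicted proteins against a profile database with HMMER, for example `hmmscan --tblout domains.tbl Pfam-A.hmm proteins.faa` (this assumes HMMER is installed, the Pfam HMM file has been prepared with `hmmpress`, and the file names are placeholders). The sketch below only parses the resulting `--tblout` table; the function name, the e-value cutoff and the accession argument are assumptions for illustration.

```python
def domain_hits(tblout_text, wanted_accession=None, max_evalue=1e-5):
    """Parse hmmscan --tblout text and return hits at or below the e-value cutoff.

    For hmmscan, the target is the profile (domain) and the query is the protein.
    wanted_accession is an optional profile accession without its version suffix.
    """
    hits = []
    for line in tblout_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        if len(fields) < 6:
            continue  # not a data row
        domain, accession, protein = fields[0], fields[1], fields[2]
        evalue = float(fields[4])  # full-sequence E-value
        score = float(fields[5])   # full-sequence bit score
        if wanted_accession and accession.split(".")[0] != wanted_accession:
            continue
        if evalue <= max_evalue:
            hits.append({"protein": protein, "domain": domain,
                         "accession": accession, "evalue": evalue, "score": score})
    return hits
```

Proteins that carry the expected domain and also show up in the similarity search are stronger candidates, but as noted above this remains a prediction until it is tested experimentally.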