Close homologs search
2.9 years ago

I have a group of closely related bacterial proteins: they belong to a larger family but form a subgroup with distinctive features and shared biochemical properties. I need to find homologous proteins from thermophiles that belong to this subgroup and have the same properties. Is it reasonable to BLAST every protein against the database, pick out the hits from thermophiles, and build a tree to see which of them fall into this subgroup? I have also tried building a PSSM from my proteins and searching with PSI-BLAST using this PSSM instead of the standard substitution matrix.

search BLAST homologs • 1.1k views
2.9 years ago
Mensur Dlakic ★ 27k

Your title and the explanation that follows contradict each other, so it is not clear what you want. If you want "close homologs" only, plain BLASTP will most likely be sufficient. PSSMs are explicitly meant for diversifying the profile and iterating the search to find distant homologs. Unless your aim is to find ALL homologs, which does not seem to be the case, I think you will do just fine with BLASTP, because it will most definitely find close homologs, and even many that are not so close.
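
As a rough illustration of the plain BLASTP route, here is a minimal Python sketch that builds a `blastp` command line and filters the default tabular output (`-outfmt 6`) for close hits. The identity and coverage cutoffs, the function names, and the file names are assumptions chosen for the example, not recommended values.

```python
import csv
import io

# Column order of BLAST+ tabular output (-outfmt 6) with the default fields.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def blastp_command(query_fasta, db, out_tsv, evalue=1e-5):
    """Build a blastp command as an argument list (it is not executed here)."""
    return ["blastp", "-query", query_fasta, "-db", db,
            "-outfmt", "6", "-evalue", str(evalue), "-out", out_tsv]


def close_hits(tabular_text, query_len, min_ident=40.0, min_cov=0.8):
    """Return subject IDs with an HSP passing the identity and query-coverage cutoffs.

    min_ident and min_cov are illustrative assumptions; tune them for your family.
    """
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        coverage = (int(hit["qend"]) - int(hit["qstart"]) + 1) / query_len
        if float(hit["pident"]) >= min_ident and coverage >= min_cov:
            if hit["sseqid"] not in kept:
                kept.append(hit["sseqid"])
    return kept
```

The command list can be passed to `subprocess.run(cmd, check=True)` once BLAST+ is installed; the filter then works on the contents of the output file.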

Separately, I am curious how you will decide which of your search results come from thermophiles. Can you really look at a species name and know whether it is a thermophile?


Thanks for your answer. To find proteins from thermophiles, one can just check the literature on the source organism for its optimal growth temperature or, if the organism is unculturable, check the conditions at the isolation site (e.g. a hot spring).

About PSSMs: for example, I have made a PSSM from the proteins that belong to my group. As far as I know, this PSSM carries information about how conserved each amino acid is at every position in the group, including the strongly conserved residues that may give my group its specific properties. Would it help to widen the search slightly and find less obvious homologs that do not share extensive overall similarity but do have these conserved residues, and therefore score higher? I am really not sure about this; the idea came to me when plain BLASTP failed to find homologs that meet these criteria.
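
To make the idea of position-specific scores concrete, here is a simplified Python sketch. It assumes an ungapped alignment of equal-length sequences, a uniform background frequency, and a flat pseudocount. PSI-BLAST uses sequence weighting and more elaborate pseudocounts, so this is only an illustration of the principle, not its actual method.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def simple_pssm(alignment, pseudocount=1.0):
    """Per-column log-odds scores against a uniform background (an assumption)."""
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(len(alignment[0])):
        # Count residues in this column, ignoring gaps and unknown symbols.
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score(pssm, segment):
    """Sum the position-specific scores of an ungapped segment."""
    return sum(col.get(aa, 0.0) for col, aa in zip(pssm, segment))
```

With a toy alignment such as `["HDG", "HEG", "HDG", "HKG"]`, the invariant H and G columns get strongly positive scores for those residues, so a candidate that keeps them scores higher than one that does not, even if the variable middle position differs.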


"To find proteins from thermophiles, one can just check the literature on the source organism for its optimal growth temperature or, if the organism is unculturable, check the conditions at the isolation site (e.g. a hot spring)."

I can tell that you have not done this before, because what you describe above is simple only in theory, and it would still take a long time for a large number of proteins. In practice, you will find that many of your hits are poorly documented, and it will take a major investment of time to figure out whether they come from thermophiles or not.
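
The bookkeeping side of this can be sketched in Python, under the assumption that you have already compiled a table of optimal growth temperatures by hand. The 45 °C cutoff, the function name, and the table layout are assumptions for illustration; definitions of "thermophile" vary. The point of the sketch is the explicit "unknown" bin, which is where poorly documented organisms end up.

```python
THERMOPHILE_MIN_OGT_C = 45.0  # assumed cutoff in degrees Celsius; definitions vary


def classify_hits(hit_organisms, ogt_table, cutoff=THERMOPHILE_MIN_OGT_C):
    """Sort hit IDs into thermophile / other / unknown.

    hit_organisms: dict mapping hit ID -> source organism name
    ogt_table: dict mapping organism name -> optimal growth temperature (deg C),
               compiled manually from the literature (hypothetical input)
    """
    result = {"thermophile": [], "other": [], "unknown": []}
    for hit_id, organism in hit_organisms.items():
        ogt = ogt_table.get(organism)
        if ogt is None:
            result["unknown"].append(hit_id)  # no documented temperature
        elif ogt >= cutoff:
            result["thermophile"].append(hit_id)
        else:
            result["other"].append(hit_id)
    return result
```

Hits in the "unknown" bin still need manual checking, for example against the isolation conditions reported for the organism.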

In the second part of your question you are now asking how to find less obvious homologs, which is not what your title indicates. Yes, it is doable the way you described, but many things can go wrong unless you know exactly what you are doing. I would simply do multiple PSI-BLAST runs, each starting from a different single sequence, and let the program handle everything automatically. Or, if you really want to build a profile from your alignment, I suggest you make a hidden Markov model (HMM) using HMMER, which you can then use to search the database for homologs. That package also includes a program called jackhmmer, which does conceptually the same thing as PSI-BLAST but should be more sensitive.
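
The three options above could be scripted roughly as follows. This Python sketch only builds the command lines as argument lists. The file names and iteration counts are placeholders, and note that `hmmsearch` and `jackhmmer` take a sequence file such as FASTA as the target, not a formatted BLAST database.

```python
def psiblast_runs(seed_fastas, db, iterations=3):
    """One independent PSI-BLAST command per single-sequence seed file."""
    return [
        ["psiblast", "-query", seed, "-db", db,
         "-num_iterations", str(iterations),
         "-outfmt", "6", "-out", seed + ".psiblast.tsv"]
        for seed in seed_fastas
    ]


def hmm_commands(alignment, hmm_file, seq_db, tblout="hmmsearch_hits.tbl"):
    """Build a profile HMM from an alignment, then search a sequence file with it."""
    return [
        ["hmmbuild", hmm_file, alignment],
        ["hmmsearch", "--tblout", tblout, hmm_file, seq_db],
    ]


def jackhmmer_command(query_fasta, seq_db, iterations=3, tblout="jackhmmer_hits.tbl"):
    """Iterative search from a single sequence, analogous to PSI-BLAST."""
    return ["jackhmmer", "-N", str(iterations), "--tblout", tblout,
            query_fasta, seq_db]
```

Each list can be run with `subprocess.run(cmd, check=True)` once BLAST+ and HMMER are installed. Comparing the hit lists from several independent seeds is one way to check how robust the results are.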

2.9 years ago

BLAST is certainly a good start (and the basis for all these kinds of analyses), so you can't go wrong there.

However, BLAST alone will not be able to deal with more complex situations. So I would personally run some gene-clustering/gene-family software (e.g. OrthoFinder) on the BLAST output. This does not take much more effort or time and will give you more robust results. (Bonus: the end result will be much easier to interpret than the raw BLAST output.)
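
As a sketch of the interpretation step, the Python function below looks up which orthogroups contain your known proteins. It assumes a tab-separated orthogroup table with a header row, the orthogroup ID in the first column, and one column per species holding comma-separated gene IDs. That is my understanding of OrthoFinder's `Orthogroups.tsv` layout, so check it against your own output. The function name is hypothetical.

```python
import csv
import io


def orthogroups_with(known_ids, orthogroups_text):
    """Return {orthogroup ID: member genes} for groups containing any known ID.

    Assumes a header row, the orthogroup ID in column 1, and one column per
    species with comma-separated gene IDs (an assumption about the file layout).
    """
    known = set(known_ids)
    reader = csv.reader(io.StringIO(orthogroups_text), delimiter="\t")
    next(reader, None)  # skip the header row
    matches = {}
    for row in reader:
        if not row:
            continue
        members = [gene.strip()
                   for cell in row[1:]
                   for gene in cell.split(",")
                   if gene.strip()]
        if known & set(members):
            matches[row[0]] = members
    return matches
```

The other members of a matching orthogroup are the candidates to check for thermophilic origin.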
