Protein search against a database and then cluster
13 months ago
bobo • 0

Hi all,

I've got a set of 8 amino acid sequences in FASTA format. I want to run a similarity search against a database such as NCBI nr and extract every amino acid sequence that matches my queries, so that I end up with a list of all the related proteins present in the database. I then want to cluster them with a clustering tool to generate something like a minimum spanning tree that shows the relationships between the extracted sequences.

Any advice on which tools to use would be appreciated.

Also, would it work to skip the similarity search step and simply download all proteins with the same name from NCBI nr, then cluster those?

Many thanks

clustering amino-acid search similarity • 482 views
13 months ago
Mensur Dlakic ★ 27k

It depends on whether your tools and databases are installed locally, or you rely on web servers.

Running an HHpred search will automatically collect the homologs, and the resulting alignment can be downloaded by clicking on Query MSA -> Download full A3M. If you run hhblits for several iterations locally, the result will be similar. MMseqs2 from the same authors will create clusters. You will have to work on your own a bit after that to get a tree.
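As a rough illustration of the "work on your own" part, here is a minimal sketch of going from a downloaded A3M alignment to a minimum spanning tree. It relies on the A3M convention that lowercase letters are insertions relative to the query, so dropping them leaves sequences of equal length. Everything else is an assumption made for the example: the function names are hypothetical, the toy sequences are made up, and the distance is a simple fraction of mismatched columns (p-distance), which is only one of many possible choices.

```python
from itertools import combinations


def read_a3m(text):
    """Parse A3M text into {name: aligned_sequence}.

    Lowercase letters (insertions relative to the query) are dropped,
    which leaves every sequence with the same number of columns.
    """
    seqs, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            seqs[name] = ""
        elif name is not None:
            seqs[name] += "".join(c for c in line if not c.islower())
    return seqs


def p_distance(a, b):
    """Fraction of mismatches over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0  # no shared columns: treat as maximally distant
    return sum(x != y for x, y in pairs) / len(pairs)


def minimum_spanning_tree(seqs):
    """Prim's algorithm on the complete graph of pairwise p-distances.

    Returns a list of (node_in_tree, new_node, distance) edges.
    """
    names = list(seqs)
    dist = {
        frozenset((a, b)): p_distance(seqs[a], seqs[b])
        for a, b in combinations(names, 2)
    }
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge that connects the tree to a new node.
        d, u, v = min(
            (dist[frozenset((u, v))], u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, d))
        in_tree.add(v)
    return edges


# Made-up toy alignment, not real data.
toy_a3m = """>query
MKTAYIAK
>hit1
MKTAYIAR
>hit2
MKSaaAYLAR
>hit3
M-SAYLGR
"""

edges = minimum_spanning_tree(read_a3m(toy_a3m))
for u, v, d in edges:
    print(f"{u} -- {v}  ({d:.3f})")
```

Keep in mind that a minimum spanning tree built this way only links nearest neighbours by distance; it is not a phylogenetic tree, and for thousands of hits the all-against-all loop will get slow, so clustering first and building the tree from cluster representatives is the more practical order.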

Searching by similar names would not work if your goal is to comprehensively identify homologs. Similar proteins are not always named the same way, and some may not have any annotations.
