Hi all,
I've got a set of 8 amino acid sequences in FASTA format. I want to run a similarity search against a database such as NCBI nr and extract every amino acid sequence that matches my queries, so that I end up with a list of all the related proteins in the database. I then want to cluster those sequences and generate something like a minimum spanning tree showing the relationships between them.
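To make the last step concrete, here is a rough Python sketch of the kind of tree I have in mind. It is only an illustration under some assumptions: the four sequences and their names are made up, they are assumed to be already aligned to the same length, and the distance is a plain fraction of mismatched positions standing in for whatever a real alignment or clustering tool would report. The tree is built with Prim's algorithm.

```python
# Toy, made-up sequences (hypothetical names), assumed to be pre-aligned
# to equal length. Real input would be the hits pulled from the database.
seqs = {
    "protA": "MKTAYIAKQR",
    "protB": "MKTAYIAKQK",
    "protC": "MKSAYLAKQK",
    "protD": "MRSGYLPKQK",
}


def p_distance(a, b):
    """Fraction of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def minimum_spanning_tree(names, dist):
    """Prim's algorithm: returns a list of (node_in_tree, new_node, distance)."""
    in_tree = [names[0]]
    remaining = set(names[1:])
    edges = []
    while remaining:
        # Pick the shortest edge linking the tree to a node not yet in it;
        # the edge tuple itself breaks ties so the result is deterministic.
        u, v = min(
            ((u, v) for u in in_tree for v in remaining),
            key=lambda e: (dist(e[0], e[1]), e),
        )
        edges.append((u, v, dist(u, v)))
        in_tree.append(v)
        remaining.remove(v)
    return edges


names = sorted(seqs)
mst = minimum_spanning_tree(names, lambda u, v: p_distance(seqs[u], seqs[v]))
for u, v, d in mst:
    print(f"{u} -- {v}  distance={d:.2f}")
```

With the real data I assume the mismatch fraction would be replaced by distances derived from proper pairwise or multiple alignments, and the number of sequences would be far larger than this all-pairs loop is suited to.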
Any suggestions on which tools to use would be appreciated.
Also, would it work to skip the similarity search step and simply download all proteins with the same name from NCBI nr, then cluster those?
Many thanks