Can we set a percent identity threshold in standalone BLAST+?
6.3 years ago

Dear Friends, I have a bunch of proteins from a particular family in different species, e.g. Species 1: 240, Species 2: 100 and Species 3: 300. I want to do a comparative analysis between the species. Before that, I must remove redundant or overlapping proteins within each species. I have already selected the proteins from the respective species using HMMER, but I need to screen them further to get the actual number of proteins of this family present in each species.

For that, I am using BLAST+ to blast each species against itself and treat all sequences that share more than 70% identity as one group. I then keep one sequence per group and remove all the others. In this way I can trim the protein set, e.g. Species 1 would go from 240 to around 70 or 80, depending on how similar the sequences are. I am not sure whether this idea is correct. If it is, is it possible to set the percent identity cutoff in BLAST+ using -perc_identity 70? I have tried it, but it shows "Unknown argument: "perc_identity"". Could anyone suggest a way to get rid of similar or repeated sequences in the family?

blast alignment • 1.6k views

Consider an alternative approach: collapse the redundant sequences using the CD-HIT suite of tools.
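As a rough illustration of this suggestion, the sketch below builds a `cd-hit` command line from Python for the 70% threshold mentioned in the question. It assumes the `cd-hit` executable is installed and on the PATH. The file names and the helper names `cdhit_command` and `run_cdhit` are hypothetical. The word size (`-n`) follows the ranges recommended in the CD-HIT user guide, so please check them against the version you have installed.

```python
import shutil
import subprocess


def cdhit_command(fasta_in, fasta_out, identity=0.7):
    """Build a cd-hit argument list for a given identity threshold (0.4 to 1.0)."""
    # Word size ranges as recommended in the CD-HIT user guide (assumption:
    # they apply unchanged to your installed version).
    if identity >= 0.7:
        word_size = 5
    elif identity >= 0.6:
        word_size = 4
    elif identity >= 0.5:
        word_size = 3
    elif identity >= 0.4:
        word_size = 2
    else:
        raise ValueError("cd-hit is not meant for thresholds below 0.4")
    return [
        "cd-hit",
        "-i", fasta_in,        # input protein FASTA
        "-o", fasta_out,       # output FASTA with one representative per cluster
        "-c", str(identity),   # sequence identity threshold
        "-n", str(word_size),  # word length matching the threshold
    ]


def run_cdhit(cmd):
    """Run the command only if cd-hit can be found on the PATH."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("cd-hit executable not found on PATH")
    subprocess.run(cmd, check=True)


# Hypothetical file names, one run per species.
cmd = cdhit_command("species1_family.fasta", "species1_family_nr70.fasta")
print(" ".join(cmd))
```

Calling `run_cdhit(cmd)` would then write the representative sequences to the output FASTA, with the cluster membership in the accompanying `.clstr` file.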


If -perc_identity is not recognized, you have most likely made a typo on the command line, or you are using a program that does not support this option. Have you also looked at the qcov_hsp_perc option? Also, BLAST may not be the best tool for this if you are considering the whole sequence, because BLAST is a local alignment algorithm.
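If the BLAST program in use does not accept `-perc_identity` (as far as I know it is listed for `blastn` but not for `blastp`; please confirm with `blastp -help`), the cutoff can be applied after the search instead. The sketch below assumes an all-vs-all search of one species against itself, written as tabular output with `-outfmt "6 qseqid sseqid pident qcovhsp"`. It keeps hits that pass both an identity and a per-HSP query coverage threshold, then groups the sequences by single linkage and keeps one representative per group. The function names, the demo data, and the choice of the alphabetically first ID as representative are all illustrative assumptions, not part of BLAST+.

```python
import csv
import io


def read_hits(handle, min_ident=70.0, min_qcov=70.0):
    """Yield (query, subject) pairs that pass both thresholds (inclusive)."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid, sseqid = row[0], row[1]
        pident, qcovhsp = float(row[2]), float(row[3])
        if qseqid != sseqid and pident >= min_ident and qcovhsp >= min_qcov:
            yield qseqid, sseqid


def group_redundant(ids, pairs):
    """Single-linkage grouping with union-find; the root is the smallest ID."""
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Attach the larger root to the smaller one so the root stays minimal.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    groups = {}
    for i in parent:
        groups.setdefault(find(i), []).append(i)
    return groups


# Made-up demo hits: qseqid, sseqid, pident, qcovhsp.
demo = (
    "p1\tp1\t100.0\t100\n"  # self hit, ignored
    "p1\tp2\t85.5\t95\n"    # passes both thresholds
    "p2\tp3\t72.0\t80\n"    # passes both thresholds
    "p4\tp1\t65.0\t90\n"    # identity too low
    "p5\tp4\t90.0\t40\n"    # coverage too low
)
all_ids = ["p1", "p2", "p3", "p4", "p5"]
groups = group_redundant(all_ids, read_hits(io.StringIO(demo)))
representatives = sorted(groups)
print(representatives)
```

Note that single linkage can chain sequences together: in the demo, p1 and p3 end up in the same group through p2 even though no direct p1 to p3 hit is listed. Because BLAST alignments are local, the coverage filter is only a partial guard against grouping proteins that share a single domain.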
