I want to download the human proteome and other completely sequenced proteomes in order to search for homologs. For human, a UniProt search returns ~136,500 sequences:
Searching for a protein sequence against this set returns far more human homologs than is plausible. Filtering with CD-HIT at 90% sequence identity does not reduce the number of hits much. The ~20,000 reviewed human entries, on the other hand, do not include all human proteins. I am wondering whether Ensembl would be a better choice.
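
To make the redundancy concrete, here is a minimal sketch of collapsing such a download to one sequence per gene. It assumes the file uses the standard UniProt FASTA header layout (`>sp|ACCESSION|ENTRY_NAME ... GN=GeneName ...`, with `sp` for reviewed and `tr` for unreviewed entries). The function names `read_fasta` and `one_per_gene` and the file name `human.fasta` are hypothetical, and "prefer reviewed, then longest" is only one possible selection rule, not an established standard.

```python
import re

# Assumption: gene names appear in the header as "GN=<name>" (UniProt FASTA layout).
GENE_RE = re.compile(r"\bGN=(\S+)")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def one_per_gene(records):
    """Keep one record per gene name: reviewed (sp|) first, then the longest.

    Entries without a GN= field are kept individually, keyed by their ID.
    """
    best = {}
    for header, seq in records:
        match = GENE_RE.search(header)
        key = match.group(1) if match else header.split()[0]
        rank = (header.startswith("sp|"), len(seq))
        if key not in best or rank > best[key][0]:
            best[key] = (rank, header, seq)
    return {key: (header, seq) for key, (_, header, seq) in best.items()}


# Usage (hypothetical file name):
# with open("human.fasta") as handle:
#     kept = one_per_gene(read_fasta(handle))
# print(len(kept), "sequences kept")
```

Comparing the number of records before and after this step gives a rough idea of how much of the ~136,500 is due to multiple entries per gene rather than distinct proteins.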