This must be very basic, but I'll ask anyway. I have a protein sequence for which I want to find homologs. I go to BLAST and run, for simplicity here, a regular BLASTp.
I know that blasting against refseq_protein or swissprot is common practice, but what about nr (non-redundant protein sequences)? This includes "All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects", and as far as I've seen, it contains not only hypothetical proteins but also multiple entries for the same protein (e.g. different combinations of PDB chains).
Would you guys consider a BLAST search against nr a proper "finding-homologs" exercise?
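
To make the concern concrete, here is a minimal sketch of the kind of post-filtering I imagine an nr search would need. It assumes the hits were exported as tab-separated text with four columns (subject ID, percent identity, E-value, subject title), for example from a local BLAST+ run with `-outfmt "6 sseqid pident evalue stitle"`. The function names, the E-value cutoff, and the sample accessions are all made up for illustration, and matching on titles is a crude heuristic, not a real redundancy clustering.

```python
import re

# Hypothetical helper: strip a leading "Chain A, " so PDB chains of the
# same protein collapse onto one key. This is a title heuristic only.
def normalize_title(title):
    return re.sub(r"^chain \w+, ", "", title.strip().lower())

def filter_hits(tabular_text, max_evalue=1e-5):
    """Drop weak and 'hypothetical protein' hits, keep one hit per title.

    Expects tab-separated lines: sseqid, pident, evalue, stitle.
    The default cutoff is an arbitrary assumption, not a recommendation.
    """
    best = {}
    for line in tabular_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        sseqid, pident, evalue, stitle = (
            fields[0], float(fields[1]), float(fields[2]), fields[3]
        )
        if evalue > max_evalue:
            continue
        if "hypothetical protein" in stitle.lower():
            continue
        key = normalize_title(stitle)
        # Keep the first hit seen for a title unless a later one has a
        # strictly lower E-value.
        if key not in best or evalue < best[key][2]:
            best[key] = (sseqid, pident, evalue, stitle)
    # Strongest hits (lowest E-value) first.
    return sorted(best.values(), key=lambda hit: hit[2])

# Made-up example hits, not real search results.
SAMPLE = "\n".join([
    "pdb|1ABC|A\t100.0\t1e-50\tChain A, Lysozyme C",
    "pdb|1ABC|B\t100.0\t1e-50\tChain B, Lysozyme C",
    "XP_000001.1\t45.0\t1e-20\thypothetical protein ABC_123 [Some organism]",
    "NP_000002.1\t98.0\t1e-48\tlysozyme C precursor [Homo sapiens]",
    "WP_000003.1\t30.0\t0.5\tunrelated protein [Bacterium]",
])

if __name__ == "__main__":
    for hit in filter_hits(SAMPLE):
        print(hit[0], hit[2], hit[3])
```

In this toy input the two PDB chains collapse into one entry and the hypothetical protein is dropped, which is exactly the sort of cleanup I suspect nr results need before I can call the remaining hits candidate homologs.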