I have a list of proteins with their UniProt IDs from an MS experiment (with no quantitative data). I need to explore the protein-protein interaction network for this list using the STRING database. However, there are some proteins that STRING could not identify ("Sorry, STRING found no proteins by this name in Homo sapiens"). I tried other identifiers for these proteins from databases such as GeneCards and Ensembl, but STRING still could not find them.
Instead of finding these proteins, STRING returned similar proteins and/or paralogs of them.
For example, I have these four proteins:
Protein in my dataset ---> STRING output
Q5JXB2 (UBE2NL) ---> P61088 (UBE2N)
P0CG22 (DHRS4L1) ---> Q9BTZ2 (DHRS4)
Q5T1J5 (CHCHD2P9) ---> Q9Y6H1 (CHCHD2)
P0C7P4 (UQCRFS1P1) ---> P47985 (UQCRFS1)
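To make the mismatch explicit, here is a minimal sketch of how I am flagging these cases. It only compares the four pairs listed above. The names `MAPPINGS` and `flag_substitutions` are my own (hypothetical) and not part of any STRING tool, and the script does not query STRING itself.

```python
# Each entry: ((submitted UniProt ID, gene), (UniProt ID, gene) returned by STRING).
# The values are copied from the four examples in this post.
MAPPINGS = [
    (("Q5JXB2", "UBE2NL"), ("P61088", "UBE2N")),
    (("P0CG22", "DHRS4L1"), ("Q9BTZ2", "DHRS4")),
    (("Q5T1J5", "CHCHD2P9"), ("Q9Y6H1", "CHCHD2")),
    (("P0C7P4", "UQCRFS1P1"), ("P47985", "UQCRFS1")),
]


def flag_substitutions(mappings):
    """Return the rows where STRING's accession differs from the submitted one."""
    flagged = []
    for (query_id, query_gene), (hit_id, hit_gene) in mappings:
        if query_id != hit_id:
            flagged.append(
                {
                    "query_id": query_id,
                    "query_gene": query_gene,
                    "string_id": hit_id,
                    "string_gene": hit_gene,
                }
            )
    return flagged


substituted = flag_substitutions(MAPPINGS)
for row in substituted:
    print(
        f"{row['query_id']} ({row['query_gene']}) was replaced by "
        f"{row['string_id']} ({row['string_gene']})"
    )
```

Keeping such a list at least lets me report which nodes in the network are substitutes rather than the proteins I actually submitted, whichever option I end up choosing.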
I am unsure what to do in this situation. Should I remove these four proteins from my dataset and exclude them from the analysis, since STRING cannot identify them? Or should I keep them and accept STRING's mapping of them to UBE2N, DHRS4, CHCHD2 and UQCRFS1? Or is there another way to handle this that I am not aware of?
I would be grateful for guidance on the best approach here. Any advice and suggestions are highly appreciated.
Best wishes, Farah