Here is the answer from UniProt:
You cannot find Ensembl gene IDs for the proteins in your list through our mapping services, because Ensembl has no corresponding gene or protein for them.
There are two possible reasons for that:
On one hand, there are still some regions of the human genome that are not properly resolved (see PMID: 35357919) in the current genome reference assembly used by Ensembl (GRCh38.p13). Ensembl is still missing some protein-coding genes.
On the other hand, UniProt has curated proteins based on mRNA, proteomics, and literature evidence, but the existence of these proteins remains dubious. There might be no corresponding gene at all, and we may deprecate these entries in the future.
We are working with Ensembl to resolve these discrepancies, but some remain, as your list shows.
Other possibilities can also explain the absence of cross-references to Ensembl in human entries. The main ones are the following:
Ensembl has a predicted protein which is not identical to the one manually curated in the reviewed (Swiss-Prot) section of UniProtKB.
Ensembl has no predicted protein for the corresponding gene because it does not consider the gene to be protein-coding. In that case there is an Ensembl gene (ENSG) but no Ensembl peptide (ENSP).
In the absence of a mapping to Ensembl, and therefore of a mapping to the reference genome, the entries are temporarily placed in the 'unplaced' component of the proteome. When a mapping to Ensembl is added, these entries will be moved to the correct chromosome component in a following release.
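For anyone who wants to check this on their own list, here is a minimal Python sketch that splits entries into those with and those without an Ensembl cross-reference. It assumes each entry is a dict shaped like the UniProtKB JSON record (a `uniProtKBCrossReferences` list whose items carry `database`, `id`, and `properties` with `ProteinId` / `GeneId` keys); please verify those field names against an entry you have actually downloaded. The helper names `ensembl_xrefs` and `split_by_ensembl` and the IDs in the example are hypothetical placeholders, not real accessions.

```python
def ensembl_xrefs(entry):
    """Return (transcript, protein, gene) ID tuples for one entry dict.

    Field names are an assumption about the UniProtKB JSON layout.
    """
    found = []
    for xref in entry.get("uniProtKBCrossReferences", []):
        if xref.get("database") != "Ensembl":
            continue  # skip cross-references to other databases
        props = {p.get("key"): p.get("value") for p in xref.get("properties", [])}
        found.append((xref.get("id"), props.get("ProteinId"), props.get("GeneId")))
    return found


def split_by_ensembl(entries):
    """Split entries into {accession: xrefs} and a list of unmapped accessions."""
    mapped, unmapped = {}, []
    for entry in entries:
        accession = entry.get("primaryAccession")
        refs = ensembl_xrefs(entry)
        if refs:
            mapped[accession] = refs
        else:
            unmapped.append(accession)
    return mapped, unmapped


# Placeholder data only; these are not real identifiers.
example_entries = [
    {
        "primaryAccession": "ACC_MAPPED",
        "uniProtKBCrossReferences": [
            {
                "database": "Ensembl",
                "id": "ENST_PLACEHOLDER",
                "properties": [
                    {"key": "ProteinId", "value": "ENSP_PLACEHOLDER"},
                    {"key": "GeneId", "value": "ENSG_PLACEHOLDER"},
                ],
            },
            {"database": "PDB", "id": "PDB_PLACEHOLDER"},
        ],
    },
    {"primaryAccession": "ACC_UNMAPPED", "uniProtKBCrossReferences": []},
]

mapped, unmapped = split_by_ensembl(example_entries)
print("with Ensembl cross-reference:", sorted(mapped))
print("without Ensembl cross-reference:", unmapped)
```

The second list is the one worth sending to the helpdesk. Note that this only tells you whether a cross-reference exists; it cannot tell you which of the reasons above explains a missing one.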
Please don't hesitate to contact the UniProt helpdesk with your list of unmapped identifiers, and we can investigate.
That is actually what I should've done first! Thank you for your suggestion. I will keep this thread open so others can chime in with their thoughts!
Based on prior postings, @Elisabeth is with UniProt support.
Thanks, we have received your list at the helpdesk, and the ticket has been assigned to a curator.
@Elisabeth I actually have a follow-up question. Looking further at those identifiers, I noticed that they are classified as 'Unplaced' under the Proteome category. Does this mean that these proteins are unmapped to the current reference genome?