I would like to correlate protein expression with mRNA expression in my breast cancer research. I downloaded the level 4 (L4) RPPA data from the TCPA portal (https://tcpaportal.org/tcpa/download.html) and got a protein expression matrix, which is great. However, I am baffled by the protein names in this file. For example, some of the names look like this: "X1433EPSILON", "EGFR", "EGFR_pY1068", "ERALPHA".
My questions: what do these protein names mean? Is the first one a valid protein name? What is the difference between the two EGFR entries, and which one should I use for correlating with EGFR mRNA expression?
Also, how should I map these names to gene symbols? I believe ERALPHA corresponds to the ESR1 gene, but which R package should I use for this mapping?
This is my first time working with RPPA data, and I couldn't find much helpful information on the TCPA portal. Any suggestions are much appreciated!