Hi! I'm a beginner in bioinformatics and I'm trying to replicate a result from the paper "TAZ Expression as a Prognostic Indicator in Colorectal Cancer" (https://www.researchgate.net/publication/235393359_TAZ_Expression_as_a_Prognostic_Indicator_in_Colorectal_Cancer).
I'm currently working with the GSE14333 dataset from GEO.
To reproduce Figure 1, I looked up the genes AXL, WWTR1, YAP1 and CTGF by their Entrez Gene IDs in data@featureData@data$ENTREZ_GENE_ID. For some genes, several rows of the expression matrix match the same Entrez Gene ID. For example:
ID // GB_ACC // ... // Gene Symbol
213342_at // AI745185 // ... // YAP1
224894_at // BF247906 // ... // YAP1
224895_at // AA557632 // ... // YAP1
YAP1 matched 3 rows, WWTR1 matched 3 rows, AXL matched 2 rows, and CTGF matched 1 row.
Each YAP1 row seems to be distinct, and each has a different expression level in the expression matrix. How should I make the scatter plot from Figure 1 in that case? Should I pick only one row when there are several? Or can I just take the average expression level across all of them?
I hope the Target Description fields below help identify each of the YAP1 rows:
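To make the two options concrete, here is a minimal sketch of what I mean. It is plain Python rather than my actual R code, the expression values are made up for illustration (they are not from GSE14333), and the function names are my own. I don't know which of the two the paper used, if either.

```python
from statistics import mean

# Hypothetical log2 expression values, one list per row (probe ID),
# one value per sample. These numbers are invented for illustration.
yap1_rows = {
    "213342_at": [6.0, 7.0, 8.0],
    "224894_at": [10.0, 11.0, 12.0],
    "224895_at": [8.0, 9.0, 10.0],
}


def collapse_by_mean(rows):
    """Option 1: average all rows of a gene, sample by sample."""
    return [mean(values) for values in zip(*rows.values())]


def pick_highest_mean_row(rows):
    """Option 2: keep only the row with the highest mean across samples."""
    best_id = max(rows, key=lambda row_id: mean(rows[row_id]))
    return best_id, rows[best_id]


yap1_mean = collapse_by_mean(yap1_rows)
best_id, yap1_best = pick_highest_mean_row(yap1_rows)

print("averaged:", yap1_mean)
print("single row:", best_id, yap1_best)
```

Either way I would end up with one value per sample for each gene, which is what I need for the scatter plot. My question is which choice is appropriate.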
 "gb:AI745185 /DB_XREF=gi:5113473 /DB_XREF=wg10a05.x1 /CLONE=IMAGE:2364656 /FEA=FLmRNA /CNT=46 /TID=Hs.8939.0 /TIER=Stack /STK=13 /UG=Hs.8939 /LL=10413 /UG_GENE=YAP65 /UG_TITLE=yes-associated protein 65 kDa /FL=gb:NM_006106.1"
 "gb:BF247906 /DB_XREF=gi:11163848 /DB_XREF=601858274F1 /CLONE=IMAGE:4068810 /FEA=EST /CNT=137 /TID=Hs.84520.0 /TIER=Stack /STK=51 /UG=Hs.84520 /UG_TITLE=ESTs"
 "gb:AA557632 /DB_XREF=gi:2328109 /DB_XREF=nl11g07.s1 /CLONE=IMAGE:1030044 /FEA=EST /CNT=137 /TID=Hs.84520.0 /TIER=Stack /STK=9 /UG=Hs.84520 /UG_TITLE=ESTs"
I'm stuck here. Could someone please give me a hand?
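In case it is useful, this is how I read the three Target Description strings above. It is a small Python sketch that assumes every field after the accession has the form /KEY=VALUE and that fields are separated by a space followed by a slash. That is only my guess from these three strings, not a documented format, and the function name parse_target_description is my own.

```python
def parse_target_description(desc):
    """Split a target description string into a dict of fields.

    Assumes the layout 'gb:ACCESSION /KEY=VALUE /KEY=VALUE ...'.
    Keys such as DB_XREF repeat, so every key maps to a list of values.
    """
    head, *parts = desc.strip().split(" /")
    accession = head[3:] if head.startswith("gb:") else head
    fields = {"accession": [accession]}
    for part in parts:
        key, _, value = part.partition("=")
        fields.setdefault(key, []).append(value)
    return fields


# The three YAP1 strings from the post, keyed by the row whose GB_ACC matches.
descriptions = {
    "213342_at": "gb:AI745185 /DB_XREF=gi:5113473 /DB_XREF=wg10a05.x1 "
                 "/CLONE=IMAGE:2364656 /FEA=FLmRNA /CNT=46 /TID=Hs.8939.0 "
                 "/TIER=Stack /STK=13 /UG=Hs.8939 /LL=10413 /UG_GENE=YAP65 "
                 "/UG_TITLE=yes-associated protein 65 kDa /FL=gb:NM_006106.1",
    "224894_at": "gb:BF247906 /DB_XREF=gi:11163848 /DB_XREF=601858274F1 "
                 "/CLONE=IMAGE:4068810 /FEA=EST /CNT=137 /TID=Hs.84520.0 "
                 "/TIER=Stack /STK=51 /UG=Hs.84520 /UG_TITLE=ESTs",
    "224895_at": "gb:AA557632 /DB_XREF=gi:2328109 /DB_XREF=nl11g07.s1 "
                 "/CLONE=IMAGE:1030044 /FEA=EST /CNT=137 /TID=Hs.84520.0 "
                 "/TIER=Stack /STK=9 /UG=Hs.84520 /UG_TITLE=ESTs",
}

parsed = {row_id: parse_target_description(d) for row_id, d in descriptions.items()}

for row_id, fields in parsed.items():
    print(row_id, fields["accession"][0], fields["FEA"][0], fields["UG"][0])
```

Read this way, 213342_at is annotated with UG=Hs.8939 and FEA=FLmRNA, while 224894_at and 224895_at are both annotated with UG=Hs.84520 and FEA=EST. I am not sure whether that difference should affect which row I use.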