Question

ArrayExpress processed data

1

6.0 years ago

alessandro.palma ▴ 30

Hi, I have a question about how to identify transcript names in processed data from ArrayExpress. I downloaded the processed data for this study (E-GEOD-24395) and got several files, each with three columns (reporter identifier, expression value and p-value).

Is there any way to find the transcript name (something similar to gene symbols) corresponding to each reporter identifier? I searched for some of these reporter identifiers in all the files provided with the study (simply opening each file and using Ctrl+F), but I couldn't find them anywhere. So the problem is that I don't really know what these "reporter identifiers" are.

I also noticed that the file named "A-MEXP-930.adf.txt" (where I can get the HUGO IDs) has exactly the same number of rows as the processed files (48,701 rows after removing the header). So I guess I could combine, in a data frame, the HUGO IDs extracted from this file with the expression values for each sample from the other files. However, I am not sure about the correspondence between the HUGO IDs and the reporter identifiers: the listed HUGO IDs could have been sorted or manipulated in some way, and there are also some blank values when I read them as a table in R.

Any help? Thanks

next-gen annotation beadChip ArrayExpress R • 1.5k views

0

Thank you! I actually verified my earlier hypothesis: the A-MEXP-930.adf.txt file is exactly the same as the table provided by GEO. But I think downloading the table you suggested is better and simpler.

6.0 years ago by alessandro.palma ▴ 30

0

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

6.0 years ago by GenoMax 141k

score 0 · Answer 1 · 2018-04-19

Hi, you can go to the platform the experiment was run on; in your case it is GPL6106. (To find it on NCBI, search for one of the sample accessions, for instance GSM601376, and look up its platform.) Download the table containing all the probe information, open it in Excel, and use the VLOOKUP formula to place the corresponding gene symbol (or other information) next to each reporter identifier. In this case the reporter identifier is the ID in that table. Good luck!
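
If you would rather script the lookup than do it in Excel, the same idea (match on the ID, never on row position) can be sketched roughly as below. This is only an illustration in plain Python: the column names `ID` and `Symbol`, the `#` comment lines, the header of the processed file, and the probe IDs and values are all made up for the example, so check them against the table you actually download.

```python
import csv
import io


def load_annotation(handle, id_col="ID", symbol_col="Symbol"):
    """Build a probe ID -> gene symbol dict from a tab-delimited platform table.

    Assumption: comment lines start with '#', and the columns are called
    'ID' and 'Symbol'. Adjust both to match the real table.
    """
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    return {row[id_col]: row[symbol_col] for row in reader}


def annotate(handle, id_to_symbol):
    """Yield (reporter_id, symbol, value, p_value) for each processed row."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    for row in reader:
        if len(row) != 3:
            continue  # ignore blank or malformed lines
        reporter_id, value, p_value = row
        # Look up by ID; probes without annotation get an empty symbol.
        symbol = id_to_symbol.get(reporter_id, "")
        yield reporter_id, symbol, float(value), float(p_value)


# Made-up stand-ins for the platform table and one processed sample file.
platform = io.StringIO(
    "#ID = probe identifier\n"
    "ID\tSymbol\n"
    "probe_A\tGENE1\n"
    "probe_B\t\n"
    "probe_C\tGENE3\n"
)
sample = io.StringIO(
    "Reporter Identifier\tVALUE\tP-VALUE\n"
    "probe_C\t7.5\t0.01\n"
    "probe_A\t9.25\t0.0\n"
    "probe_X\t6.0\t0.5\n"
)

id_to_symbol = load_annotation(platform)
annotated = list(annotate(sample, id_to_symbol))
for row in annotated:
    print(*row, sep="\t")
```

With real files you would pass `open(path, newline="")` handles instead of the `io.StringIO` objects. Because the match is done on the ID, it does not matter whether the annotation table and the processed files are sorted the same way, and probes with a blank or missing symbol simply come out with an empty string.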