Question: How to use the data from Affymetrix Human Exon 1.0 ST Array for microarray gene expression analysis?
Hello, I was reading this article from the TCGA research group about glioblastoma multiforme. It uses microarray data from three platforms, one of which is the Affymetrix Human Exon 1.0 ST Array, but this platform measures exons. I looked at the supplementary data for the article on this website and found that they transformed the data to end up with gene-level expression. So my question is: how can that be done?

I'm very new to this, so even if this is a silly question, the answer will help me a lot. Thanks in advance.

exon microarray tcga
This GUI may be able to help you:

Charles Warden4.9k
Duarte, CA
On the linked website, LBL202.txt is a gene-level summary of expression values. This would be most helpful if you wanted to get an idea of the expression level of the gene, compared to the other arrays which have probes that target different parts of the gene. It will also be more robust than the exon-level summarization.
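To make "gene-level summary" concrete, here is a minimal Python sketch that collapses exon-level values into one value per gene and sample by taking the median across that gene's probesets. This is only an illustration under my own assumptions: the median is a simple robust choice, not necessarily what was used to produce LBL202.txt, and the probeset IDs, gene names, and values are made up.

```python
from collections import defaultdict
from statistics import median


def summarize_to_genes(exon_values, exon_to_gene):
    """Collapse exon-level values to one value per gene and sample.

    exon_values:  {probeset_id: [value for sample 1, sample 2, ...]}
    exon_to_gene: {probeset_id: gene_name}
    Probesets without a gene annotation are skipped.
    """
    grouped = defaultdict(list)
    for probeset_id, values in exon_values.items():
        gene = exon_to_gene.get(probeset_id)
        if gene is not None:
            grouped[gene].append(values)
    # zip(*rows) walks the samples; the median is taken across probesets.
    return {
        gene: [median(sample_values) for sample_values in zip(*rows)]
        for gene, rows in grouped.items()
    }


# Hypothetical toy data: four probesets, two samples.
exon_values = {
    "ps1": [7.0, 8.0],
    "ps2": [7.4, 8.2],
    "ps3": [9.0, 6.0],  # one odd exon barely moves the gene-level median
    "ps4": [5.0, 5.5],
}
exon_to_gene = {"ps1": "GENE_A", "ps2": "GENE_A", "ps3": "GENE_A", "ps4": "GENE_B"}

gene_values = summarize_to_genes(exon_values, exon_to_gene)
print(gene_values)
```

Because a single unusual exon (for example one affected by alternative splicing) has little effect on the median, a summary of this kind tends to be more stable than any individual exon-level value.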

If you have access to raw .CEL files for HuExon arrays, your question would be similar to this post:

Computing Expression From Affymetrix Exon Array Data

Thank you very much for your answer. I recently found that TCGA also provides gene-level expression values (although when I am not working with TCGA data, I will need the information in the links posted here). Now I have another problem:

I downloaded data from TCGA using the Data Matrix option, with these selections:

  1. Select a disease: GBM - Glioblastoma multiforme
  2. Center/Platform: LBL (HuEx-1_0-st-v2)
  3. Batch Number: Batch 1
  4. Data Level: level 3

Then I opened the file lbl.gov_GBM.HuEx-1_0-st-v2.1.gene.txt, which has 29 signal measurements. But when I open the file FILE_SAMPLE_MAP, it lists only 25 samples for lbl.gov_GBM.HuEx-1_0-st-v2.1.gene.txt. What does this mean? Could you explain it?

That is a good question - I can see that you would need that file in order to map the expression values to patients, and the number of columns in the gene file is greater than the number of samples in the mapping file.

However, if you look at the .sdrf.txt file, there are mappings for all the samples (in the "Normalization Name" or "Hybridization Name" column), even though that file contains way more than 30 rows. It also looks like the explanation is at least partially that control samples were run, while the mapping file only lists patient IDs.
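If you want to check this yourself, a small Python sketch like the one below lists which columns of the gene file are missing from a mapping table. Everything in it is an assumption for illustration: the toy tables, the `probe`, `array`, and `patient` column names, and the array names are hypothetical, and only "Hybridization Name" comes from the .sdrf.txt file mentioned above. The real TCGA files may be laid out differently, so adjust the column names after inspecting them.

```python
import csv
import io


def matrix_sample_columns(handle, id_columns=1):
    """Return the sample column names from the header of a tab-delimited matrix."""
    header = next(csv.reader(handle, delimiter="\t"))
    return header[id_columns:]


def column_values(handle, column):
    """Return the set of values found in one named column of a tab-delimited table."""
    return {row[column] for row in csv.DictReader(handle, delimiter="\t")}


def split_by_mapping(sample_columns, mapped_names):
    """Split matrix columns into those present in a mapping and those absent."""
    mapped = [name for name in sample_columns if name in mapped_names]
    unmapped = [name for name in sample_columns if name not in mapped_names]
    return mapped, unmapped


# Hypothetical stand-ins for the gene file, FILE_SAMPLE_MAP, and the .sdrf.txt file.
gene_file = io.StringIO("probe\tarray_01\tarray_02\tarray_ctrl\nGENE_A\t7.4\t8.0\t7.9\n")
sample_map = io.StringIO("array\tpatient\narray_01\tpatient_1\narray_02\tpatient_2\n")
sdrf = io.StringIO(
    "Hybridization Name\tnote\n"
    "array_01\ttumor\narray_02\ttumor\narray_ctrl\tcontrol\n"
)

columns = matrix_sample_columns(gene_file)
in_map, not_in_map = split_by_mapping(columns, column_values(sample_map, "array"))
_, not_in_sdrf = split_by_mapping(columns, column_values(sdrf, "Hybridization Name"))

print("Columns without a patient mapping:", not_in_map)
print("Columns missing from the SDRF:", not_in_sdrf)
```

With real files you would replace the `io.StringIO` objects with `open(path, newline="")`. Any column that shows up in the first list but not the second is a candidate control sample.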

