Question: Processing gene expression data
4 weeks ago, Natasha wrote:

This is a follow-up to my previous question.

I would like to implement the following steps, given in the supplementary file of this study, to reproduce Figure 1 of the paper.

We used Affymetrix microarray data from a recent thorough analysis of the mouse and human transcriptomes [1]. We selected all 54 adult mouse non-cancer samples. The raw intensity data were transformed to normalized expression levels with the robust multi-array average (RMA) lowlevel algorithm [2] implemented in the BioConductor package [3]. We used standard settings, including perfect match (PM) only, model-based background and quantile normalization across experiments [4]. Similar results were obtained using the microarray analysis suite (MAS5) function followed by log-transformation to calculate expression levels (data not shown).

The mouse data are available on GEO under accession number GSE1133. The data are available in several formats: CDF, CIF, GIN, PSI, SIF, PROBE, TAB, and TXT. I am not sure which of these formats contains the raw intensity data and has to be downloaded to implement the procedure described above.

gene-expression • 136 views
4 weeks ago, ATpoint wrote:

It is the CEL files under GSE1133 at the bottom of the page. Under GSE1133_RAW.tar press Custom to make a selection for the samples you need. CEL files store the raw intensity values that can be processed with standard software for normalization. Please search around on how to read and handle CEL files.

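For readers following along, here is a minimal base R sketch of unpacking the downloaded archive and listing the CEL files it contains. The local paths `GSE1133_RAW.tar` and `GSE1133_RAW` and the helper name `list_cel_files` are assumptions made for illustration. GEO often serves CEL files gzip-compressed, so the pattern accepts both `.CEL` and `.CEL.gz`.

```r
# Hypothetical helper: list CEL files (plain or gzip-compressed) in a directory.
list_cel_files <- function(cel_dir) {
  list.files(cel_dir, pattern = "\\.cel(\\.gz)?$",
             ignore.case = TRUE, full.names = TRUE)
}

# Assumed local paths; adjust to wherever the archive was saved.
tar_path <- "GSE1133_RAW.tar"
cel_dir  <- "GSE1133_RAW"

if (file.exists(tar_path)) {
  untar(tar_path, exdir = cel_dir)      # unpack the archive into cel_dir
  cel_files <- list_cel_files(cel_dir)
  print(length(cel_files))              # number of CEL files found
}
```

The archive is only unpacked if it is present, so the snippet is safe to run before the download has finished.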
4 weeks ago, c.chakraborty wrote:

Don't you have access to the raw .CEL files? Also, which paper is this? Could you please share the link or DOI? I checked, and .CEL files are available for this dataset. If you want to analyse microarray data for gene expression, you should use the .CEL files. They are packed in a tar archive in the supplementary files section.


Many thanks for the response. Yes, the raw CEL files are available here. Figure 1, which I want to reproduce, is in this article (please find the link here). A description of how the figure was created can be found in the supplementary material. Figure 1 was created using the data from this study (please find the link here).

In total, 438 GSM files are listed. I am not sure how to distinguish human from mouse samples (I think this can be filtered using the platform ID), or cancerous from normal samples. Any suggestion on which package to use for the RMA normalization described here would be really helpful.

Reply written 4 weeks ago by Natasha (modified 4 weeks ago)

I think everything prefixed MGM is mouse, and the rest (1B/3A) is human. Simply click the GSM... links; each sample page states the organism. Check whether the pattern I suggested above holds for the majority of the samples.

Reply written 4 weeks ago by ATpoint (modified 4 weeks ago)

Thank you. It is mentioned that the GPL1073 (GNF1M) platform is for mouse (GSM18584 to GSM18705) and GPL1074 (GNF1H) is for human (GSM18706 to GSM18863).

However, I couldn't find the platform ID in the CEL files.

Reply written 4 weeks ago by Natasha
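Since the platform ID is not needed once the GSM ranges are known, the mouse files can be picked by accession number alone. Below is a small base R sketch. The helper `select_gsm_range` is hypothetical, and it assumes each file name contains its GSM accession (for example `GSM18584.CEL`). Note that this range covers every sample on the mouse platform as quoted above; it does not by itself pick out the 54 adult non-cancer samples, which still requires the sample annotations.

```r
# Hypothetical helper: keep files whose GSM number lies in [first, last].
select_gsm_range <- function(files, first, last) {
  names_only <- basename(files)
  has_id <- grepl("GSM[0-9]+", names_only)
  id <- rep(NA_integer_, length(files))
  # Extract the digits that follow "GSM" for the files that carry an accession.
  id[has_id] <- as.integer(sub("^.*GSM([0-9]+).*$", "\\1", names_only[has_id]))
  files[!is.na(id) & id >= first & id <= last]
}

# Example with made-up file names around the boundaries of the mouse range.
files <- c("GSM18583.CEL", "GSM18584.CEL", "GSM18705.CEL.gz", "GSM18706.CEL")
mouse_files <- select_gsm_range(files, 18584, 18705)
print(mouse_files)
```

Files without a GSM accession in their name are dropped rather than raising an error.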

Go to the supplementary file GSE1133_RAW.tar. Click on Custom and it will lead you to all the .CEL files in this dataset. You can download whichever you need for your analysis.

Reply written 4 weeks ago by c.chakraborty

Thank you. I am trying to normalize using the following code:

library(affy)        # Bioconductor package
Data <- ReadAffy()   # reads all .CEL files in the working directory
eset <- rma(Data)    # RMA normalization

Is this right? I am trying to normalize all mouse samples (i.e. GSM18584 to GSM18705) together.

Reply written 4 weeks ago by Natasha
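A slightly more explicit variant of that code, as a sketch rather than a verified recipe for this dataset: it passes the selected files to `ReadAffy()` through its `filenames` argument instead of relying on the working directory, then extracts the expression matrix. The directory `GSE1133_RAW` and the output file name are assumptions, and the files are assumed to be uncompressed `.CEL` files. `ReadAffy()` also needs a CDF annotation for the array type; for a custom platform such as GNF1M this may have to be supplied separately (see the `cdfname` argument), which has not been checked here.

```r
# Assumed directory holding the (uncompressed) mouse CEL files.
cel_dir   <- "GSE1133_RAW"
cel_files <- list.files(cel_dir, pattern = "\\.CEL$", full.names = TRUE)

if (!requireNamespace("affy", quietly = TRUE)) {
  message("The affy package is not installed; install it from Bioconductor first.")
} else if (length(cel_files) == 0) {
  message("No CEL files found in ", cel_dir)
} else {
  raw  <- affy::ReadAffy(filenames = cel_files)  # read only the listed files
  eset <- affy::rma(raw)       # background correction, quantile normalization, summarization
  expr <- Biobase::exprs(eset) # probe set x sample matrix of log2 expression values
  write.csv(expr, "rma_expression.csv")          # assumed output file name
}
```

Normalizing all selected arrays in a single `rma()` call matters because quantile normalization is computed across the whole set of arrays.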