Question: Clustering tissues using gene expression data
15 months ago, Natasha wrote:

I would like to reproduce the tissue cluster tree reported in figure 1 and figure 2 of this paper.

The supplementary document accompanying this paper mentions that the raw intensity data from "A gene atlas of the mouse and human protein-encoding transcriptomes" were used. However, I couldn't find the raw intensity files on NCBI.

Has anyone had a chance to reproduce the result reported in this reference? It would be of great help if the data and the scripts used to generate the cluster map were available in a public repository.

clustering gene-expression • 292 views
modified 12 months ago by Biostar • written 15 months ago by Natasha

No, it doesn't. It says: "The raw intensity data were transformed to normalized expression levels with the robust multi-array average (RMA) low-level algorithm [2] implemented in the BioConductor package [3]." So they used normalized intensity values and, from these, probably the differences between the samples. This is array data, not RNA-seq, so the measurements are relative: arrays can inform you about differences between samples, but you cannot derive anything from the intensity of a single gene on its own. I would also be surprised if the authors responded, as the paper is from 2006; the people involved (except the senior author) probably left many years ago. If you want to reproduce the result, download the raw data, normalize it, perform a differential analysis, and then cluster based on the resulting log2 fold-changes, perhaps transformed to the Z-scale.
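The last step of that suggestion (Z-scaling followed by clustering of tissues) might look roughly like the Python sketch below. It is only an illustration under stated assumptions: the correlation distance, the average linkage, the helper names `zscore_genes` and `cluster_tissues`, and the toy expression matrix are all choices made here, not settings confirmed by the paper.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list, fcluster
from scipy.spatial.distance import pdist


def zscore_genes(expr):
    """Z-scale each gene (row) across tissues (columns)."""
    mean = expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, ddof=1, keepdims=True)
    std[std == 0] = 1.0  # constant genes become all zeros instead of dividing by zero
    return (expr - mean) / std


def cluster_tissues(expr, method="average"):
    """Hierarchically cluster tissues (columns); distance and linkage are assumptions."""
    z = zscore_genes(expr)
    # pdist compares rows, so transpose to get one row per tissue
    dist = pdist(z.T, metric="correlation")
    return linkage(dist, method=method)


# Hypothetical toy data: 200 genes x 6 tissues, two groups of related tissues.
# Real input would be the normalized log2 expression matrix.
rng = np.random.default_rng(0)
tissues = ["brain_a", "brain_b", "brain_c", "liver_a", "liver_b", "liver_c"]
profile_brain = rng.normal(8, 2, size=(200, 1))
profile_liver = rng.normal(8, 2, size=(200, 1))
expr = np.hstack([
    profile_brain + rng.normal(0, 0.3, size=(200, 3)),
    profile_liver + rng.normal(0, 0.3, size=(200, 3)),
])

tree = cluster_tissues(expr)                        # SciPy linkage matrix
groups = fcluster(tree, t=2, criterion="maxclust")  # cut the tree into two clusters
order = [tissues[i] for i in leaves_list(tree)]     # left-to-right leaf order
print(dict(zip(tissues, groups)))
print(order)
```

The linkage matrix in `tree` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw the tree. Whether the result resembles the published figures will depend on the distance measure and linkage method the authors actually used.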

written 15 months ago by ATpoint

Thanks a lot for the response. The "Microarray procedure" section of the reference, "A gene atlas of the mouse and human protein-encoding transcriptomes", mentions that the raw files can be found at http://symatlas.gnf.org. However, I couldn't locate the raw files.

modified 15 months ago • written 15 months ago by Natasha

@ATpoint Apparently, SymAtlas has been migrated to BioGPS, and the supplementary files are available here.

I could find the same files on GEO under accession number GSE1133. However, the data are available in several formats (CDF, CIF, GIN, PSI, SIF, PROBE, TAB, TXT), and I am not sure which format has to be downloaded to implement the following suggestion from the response above:

normalize, perform differential analysis and then cluster based on the obtained log2 fold-changes, maybe transformed to the Z-scale
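One possible reading of the "normalize" and "log2 fold-changes" parts of that suggestion is sketched below in Python. It is a simplified illustration, not RMA itself: RMA combines background correction, quantile normalization, and median-polish summarization, and only the quantile step is shown. Using the per-gene median across tissues as the reference for the fold-change is an assumption made here, and the function names and toy intensities are hypothetical.

```python
import numpy as np


def quantile_normalize(intensity):
    """Give every sample (column) the same intensity distribution.

    Ties are broken arbitrarily by argsort; this is a simplification.
    """
    order = np.argsort(intensity, axis=0)
    ranks = np.argsort(order, axis=0)                    # rank of each value within its column
    reference = np.sort(intensity, axis=0).mean(axis=1)  # mean of the k-th smallest values
    return reference[ranks]


def log2_fold_change_vs_median(intensity):
    """log2 ratio of each tissue to the per-gene median across tissues (assumed reference)."""
    log_expr = np.log2(intensity)
    return log_expr - np.median(log_expr, axis=1, keepdims=True)


# Hypothetical toy intensities: 100 probes x 4 samples with different overall scales
rng = np.random.default_rng(1)
intensity = rng.lognormal(mean=6, sigma=1, size=(100, 4)) * np.array([1.0, 2.0, 0.5, 1.5])

normalized = quantile_normalize(intensity)
lfc = log2_fold_change_vs_median(normalized)
print(normalized.mean(axis=0))  # column means agree after quantile normalization
print(lfc.shape)
```

For real Affymetrix data, the Bioconductor RMA implementation mentioned in the quoted methods would be the natural choice for the normalization step; the sketch only shows what the quantile step does to the sample distributions.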

modified 15 months ago • written 15 months ago by Natasha

Have you tried contacting the authors?

written 15 months ago by Jean-Karim Heriche

Yes, I wrote an email to Prof. Bork, who is the corresponding author. Unfortunately, I haven't received a response yet.

written 15 months ago by Natasha