Is There Any Way To Use The Log Normalized Ratios To Find Absolute Signal Intensities Of Every Gene?
10.2 years ago

I need a set of highly expressed, constitutively expressed genes for an organism of interest. I found microarray data for my organism in which whole-transcriptome profiling was done to find differential expression between two developmental stages of the same organism.

I have downloaded the microarray dataset for my organism from NCBI GEO. I have one file containing the raw data and another containing log-normalized (processed) data for every gene in the organism. The authors indicate in their paper that genes with log2 normalized ratios between 0.6 and 1.7 are taken to be the constitutively expressed gene set. The genes in this range can be extracted from the processed data file of log-normalized values.
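A minimal Python sketch of that extraction step, assuming the processed file is tab-delimited with one gene ID column and one log2 ratio column. The column names `ID` and `log2_ratio` and the inline sample rows are hypothetical stand-ins for whatever the GEO file actually contains, and treating the range as inclusive is also an assumption.

```python
import csv
import io

# Hypothetical stand-in for the processed GEO file (tab-delimited).
# Real column names will differ; adjust ID_COL and RATIO_COL accordingly.
PROCESSED = (
    "ID\tlog2_ratio\n"
    "geneA\t0.10\n"
    "geneB\t0.75\n"
    "geneC\t1.60\n"
    "geneD\t2.40\n"
    "geneE\t\n"
)

ID_COL, RATIO_COL = "ID", "log2_ratio"
LOW, HIGH = 0.6, 1.7  # range quoted from the paper


def genes_in_range(handle, low=LOW, high=HIGH):
    """Return {gene_id: log2_ratio} for rows whose ratio lies in [low, high]."""
    selected = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        value = row[RATIO_COL].strip()
        if not value:  # skip spots with a missing ratio
            continue
        ratio = float(value)
        if low <= ratio <= high:
            selected[row[ID_COL]] = ratio
    return selected


constitutive = genes_in_range(io.StringIO(PROCESSED))
print(sorted(constitutive))
```

With a real file, `io.StringIO(PROCESSED)` would be replaced by an open file handle.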

But to find the highly expressed genes within this constitutively expressed set, I need the individual intensity of each gene from the raw data, so that I can compare them and pick out the highly expressed constitutive genes.

Is there any way to use the log-normalized ratios to find the absolute signal intensity of each gene? If so, how can it be done? Also, if I can find absolute intensities from the raw data, which column in the raw data file should be chosen?

microarray normalization • 5.1k views
10.2 years ago
seidel 11k

If all you have are ratios, the answer is NO: you cannot extract intensity values from ratios. However, you mentioned that you have the raw data that was used to construct those ratios, so you simply have to match your ratios back to the raw data in a gene-wise fashion. The raw data consists of microarray intensity values that generally reflect expression levels (a brighter signal in test relative to control indicates higher expression in test). Which specific columns to use is impossible to say without specific information about the platform, etc. You have to explore and learn your data. However, one of the columns will reflect the values for the numerator of your ratios, while another column will reflect the denominator.
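To make both points concrete, here is a small Python sketch. All spot values and gene IDs are made up for illustration; the only assumption carried over from the thread is that the processed value is a log2 ratio of test over control.

```python
import math

# Two made-up spots: (test intensity, control intensity).
# Both have the same ratio but very different brightness.
dim_spot = (200.0, 100.0)
bright_spot = (20000.0, 10000.0)


def log2_ratio(test, control):
    """log2(test / control): the kind of value stored in the processed file."""
    return math.log2(test / control)


# Identical ratios, so the ratio alone cannot tell the two spots apart.
same_ratio = log2_ratio(*dim_spot) == log2_ratio(*bright_spot)

# Matching ratios back to raw data gene-wise: both tables are keyed by
# a shared gene/spot ID (hypothetical IDs and values).
ratios = {"geneA": 1.0, "geneB": 1.0}
raw = {"geneA": dim_spot, "geneB": bright_spot, "geneC": (500.0, 450.0)}

matched = {
    gene: {
        "log2_ratio": ratios[gene],
        "test": raw[gene][0],
        "control": raw[gene][1],
    }
    for gene in ratios
    if gene in raw  # keep only genes present in both tables
}
print(same_ratio, matched)
```

The intensities in `matched` come only from the raw table; nothing in `ratios` could have produced them.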

edit: given your description of the platform in the comments below, they printed their own arrays and used a GenePix scanner (Molecular Devices) and software to quantify the results. The description of all the column headings can be found in the GenePix manual, which is a good read on how to quantify signal from a grid of pixels. Nonetheless, the values you are probably most interested in are F647 Median, B647 Median, F555 Median, and B555 Median. These are the median Foreground and Background measurements for each channel (i.e. the red and green light, or the test and control samples) for each microarray spot. Depending on resolution, each spot might consist of 80 pixels, so the median is used to summarize the spot intensity (but you can see from the columns that you could choose the mean instead; there are many options there, e.g. you could take the median of the pixel-by-pixel ratios).

What you do next can be simple or complicated. The data is NOT normalized (which you can confirm by plotting it), and you can choose to subtract background or not (opinions differ on whether this is important). So you should normalize, and then take ratios, but there are several methods of normalization. I think your best bet is to read the limma userguide for R (http://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf), where you'll find a good discussion of the issues, and if you know any R you can explore the data rather quickly.
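As a rough illustration of those steps, the following Python sketch takes the four median columns, optionally subtracts background, computes a per-spot log ratio M and average log intensity A, and median-centers M. This is not the authors' method and is no substitute for the normalization methods discussed in the limma guide: median-centering is used here only because it is the simplest global normalization (it is equivalent to rescaling one channel by a constant factor before taking ratios). The spot values are invented, and dropping spots whose signal is not above background is an assumption.

```python
import math
from statistics import median

# Invented spots using the GenePix column names discussed in this thread.
spots = [
    {"ID": "geneA", "F647 Median": 1100, "B647 Median": 100,
     "F555 Median": 600, "B555 Median": 100},
    {"ID": "geneB", "F647 Median": 8100, "B647 Median": 100,
     "F555 Median": 2100, "B555 Median": 100},
    {"ID": "geneC", "F647 Median": 300, "B647 Median": 100,
     "F555 Median": 150, "B555 Median": 100},
    {"ID": "geneD", "F647 Median": 90, "B647 Median": 100,
     "F555 Median": 400, "B555 Median": 100},
]


def channel_signals(spot, subtract_background=True):
    """Return (647 signal, 555 signal), optionally background-subtracted."""
    r, g = spot["F647 Median"], spot["F555 Median"]
    if subtract_background:
        r -= spot["B647 Median"]
        g -= spot["B555 Median"]
    return r, g


def ma_values(spots, subtract_background=True):
    """Compute M = log2(R/G) and A = mean of log2(R) and log2(G) per spot."""
    out = {}
    for spot in spots:
        r, g = channel_signals(spot, subtract_background)
        if r <= 0 or g <= 0:  # log is undefined; drop the spot
            continue
        out[spot["ID"]] = {
            "M": math.log2(r / g),
            "A": 0.5 * (math.log2(r) + math.log2(g)),
        }
    return out


def median_center(values):
    """Shift M so that its median is 0 (a simple global normalization)."""
    shift = median(v["M"] for v in values.values())
    return {k: {"M": v["M"] - shift, "A": v["A"]} for k, v in values.items()}


normalized = median_center(ma_values(spots))
# Rank by average log intensity A: brightest spots first.
by_intensity = sorted(normalized, key=lambda k: normalized[k]["A"], reverse=True)
print(by_intensity)
```

Ranking by A (or by a single channel's intensity) is one way to pick the brightest spots among genes already selected by their ratio; which measure is appropriate depends on the experiment.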


Thank you for your prompt reply. The experiment compared two conditions, labelled with Alexa 647 and Alexa 555 dyes respectively. Though no platform was specifically named, the paper states that "The microarrays were printed on SuperChip (Erie Scientific) using a BioRobotics MicroGrid (Genomic solutions Inc, Ann Arbor, MI)." The data was processed by GenePix Pro, and the raw data consists of the following columns: "Block Column Row Name ID X Y Dia. F647 Median F647 Mean F647 SD F647 CV B647 B647 Median B647 Mean B647 SD B647 CV % > B647+1SD % > B647+2SD F647 % Sat. F555 Median F555 Mean F555 SD F555 CV B555 B555 Median B555 Mean B555 SD B555 CV % > B555+1SD % > B555+2SD F555 % Sat. Ratio of Medians (647/555) Ratio of Means (647/555) Median of Ratios (647/555) Mean of Ratios (647/555) Ratios SD (647/555) Rgn Ratio (647/555) Rgn R2 (647/555) F Pixels B Pixels Circularity Sum of Medians (647/555) Sum of Means (647/555) Log Ratio (647/555) F647 Median - B647 F555 Median - B555 F647 Mean - B647 F555 Mean - B555 F647 Total Intensity F555 Total Intensity SNR 647 SNR 555 Flags Normalize Autoflag". Which of these columns should I choose? I'm a novice in microarrays and still learning.
