Question: Extracting values from a normalised data file to do a calculation
4.5 years ago by
cara7810 wrote:


I have a normalized file, referred to in my programme as "geneSummaries". For each gene, I want to get the median across the samples. I am using a for loop to do this.

I normalised the .CEL files and stored the result in geneSummaries:

exprs(geneSummaries)-> num
for(i in num){
    geneSummaries[-1, ]
    geneSummaries[ ,i]
    # the calculation will go here
}

So my question is about pulling the data out of the file to do the calculation. I want to drop the first column, as this is just the name of the sample. Below is a sample of the file as it looked when I opened it.

7945460  7.471390  7.256158  7.287770  7.545794 
7945462  7.609366  7.528324  7.324294  7.310791
7945475  5.375443  5.749566  5.519073  5.806861

To drop that column I used geneSummaries[-1, ], but it doesn't work.

To pull the values out of the ith column, so that I can do the calculation later, I used geneSummaries[ ,i], which also doesn't work.

Could someone suggest an idea for how to do this, please?

R extract normalized matrix • 1.2k views
modified 4.5 years ago by rmnc20070 • written 4.5 years ago by cara7810

Thanks so much .........

modified 4.5 years ago by Devon Ryan86k • written 4.5 years ago by rmnc20070
4.5 years ago by
Devon Ryan86k (Freiburg, Germany) wrote:

I'm assuming that geneSummaries is an eSet. If you just want the median, then it'd be a lot simpler to just:

medians <- apply(exprs(geneSummaries), 1, median)

I'm guessing that the rows, rather than the columns, are the probes/genes. The matrix returned by exprs() doesn't hold the row or column names as data; they are just labels that are displayed when a matrix is shown on screen, so there is no first column to drop.
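A small sketch of what this looks like, using a toy matrix in place of exprs(geneSummaries). The three probe rows are the values from the question; the sample names S1 to S4 and the name expr_mat are made up for illustration.

```r
# Toy stand-in for exprs(geneSummaries): rows are probes, columns are samples.
# Sample names S1..S4 are hypothetical.
expr_mat <- matrix(
  c(7.471390, 7.256158, 7.287770, 7.545794,
    7.609366, 7.528324, 7.324294, 7.310791,
    5.375443, 5.749566, 5.519073, 5.806861),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("7945460", "7945462", "7945475"),
                  c("S1", "S2", "S3", "S4"))
)

# The probe IDs are row names, not a data column: 3 probes by 4 samples.
dim(expr_mat)

# One median per probe (MARGIN = 1 means "apply the function over rows").
medians <- apply(expr_mat, 1, median)
medians
```

The result is a named numeric vector with one entry per probe, so no loop and no column dropping is needed.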

modified 4.5 years ago • written 4.5 years ago by Devon Ryan86k


I have another question, please. I am doing a survival analysis and the plan is:

Normalise the data

Determine the number of genes

For every ith gene on the array

                Find the median of the ith gene across samples

                Set the gene expression to 1 or 0 depending on whether the raw expression value is above or below the median (diGene)

                gene_survival <- coxph(Surv(survival_time, status) ~ diGene)

The code I have so far:

cel_files <- dir(data_directory, full.names = T, pattern = ".CEL")

norm_data <- just.gcrma(cel_files)

exprs(norm_data)-> num

# for loop
for(i in num){
    medians <- apply(exprs(norm_data), 1, median)

    if(medians > norm_data){
        diGene <- 1;
    } else {
        diGene <- 0;
    }
}

My question is about the comparison "if(medians > norm_data)". I need to compare against the raw expression values, but I get this error:

Error in medians > norm_data :
  comparison (6) is possible only for atomic and list types

Am I wrong in thinking that norm_data is the raw (un-normalised) data? Or does "raw expression" mean the values before quality control and probeset filtration are done?







modified 4.5 years ago • written 4.5 years ago by cara7810

Firstly, remove the for(i in num) loop. It iterates over every value in the matrix, which isn't what you want, and i is never used inside the loop anyway. Besides, medians <- apply(exprs(geneSummaries), 1, median) already produces the vector of row medians in one line.
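To make the loop behaviour concrete, here is a minimal sketch with an arbitrary 3 x 4 toy matrix standing in for exprs(norm_data); the names n_iter, g and row_median are made up for illustration.

```r
# Toy 3 x 4 matrix standing in for exprs(norm_data); values are arbitrary.
num <- matrix(1:12, nrow = 3)

# for (i in num) walks over every element, not over rows or columns,
# so i takes each expression value in turn.
n_iter <- 0
for (i in num) {
  n_iter <- n_iter + 1
}
n_iter == length(num)  # one iteration per matrix element

# If a loop is really wanted, iterate over row indices instead,
# so that each gene (row) is visited exactly once.
for (g in seq_len(nrow(num))) {
  row_median <- median(num[g, ])
}
```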

Secondly, medians > geneSummaries doesn't make sense since medians is a vector of values and geneSummaries is an eSet object (thus leading to that error).
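Purely to illustrate the type issue (not as an endorsement of the dichotomisation), a comparison of this kind has to be made against the numeric matrix rather than the eSet. The sketch below assumes a toy matrix in place of exprs(geneSummaries); the sample names and the names expr_mat and above are hypothetical.

```r
# Toy stand-in for exprs(geneSummaries); sample names S1..S4 are made up.
expr_mat <- matrix(
  c(7.471390, 7.256158, 7.287770, 7.545794,
    7.609366, 7.528324, 7.324294, 7.310791,
    5.375443, 5.749566, 5.519073, 5.806861),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("7945460", "7945462", "7945475"),
                  c("S1", "S2", "S3", "S4"))
)

medians <- apply(expr_mat, 1, median)

# Compare every value with the median of its own row.
# sweep() lines the medians up with the rows (MARGIN = 1).
above <- sweep(expr_mat, 1, medians, FUN = ">")

# Turn TRUE/FALSE into 1/0 while keeping the matrix shape and names.
diGene <- above * 1L
diGene
```

Note that if() expects a single TRUE or FALSE, so a whole-matrix comparison like this is done in one vectorised step rather than inside an if statement.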

Finally, regressing on whether a gene's expression in a subject is above/below the median is a bad idea. Firstly, you'll be performing so many regressions that your power will be atrocious. Secondly, even if a gene happened to correlate with survival, this isn't exactly meaningful/useful if the variance of the gene is itself really small.

I would recommend reassessing your approach before spending more time on it.

written 4.5 years ago by Devon Ryan86k

Ok thanks will have another look at it.

written 4.5 years ago by cara7810