Randomising row order in a subset of columns of a gene expression matrix (R)
7.4 years ago
m.fletcher ▴ 20

Hello all,

I'm using R to investigate gene co-expression modules in a dataset, and I would like to add some 'noise' to the expression matrix and then re-compute the co-expression modules from this noisy matrix. This would give me an idea of how robust the modules are.

(In this case we have rows = genes, columns = samples)

The best idea so far is to re-order the rows within a subset of samples (and vary the number of samples reordered to simulate increasing amounts of noise).

Unfortunately I cannot find a quick way to reorder the rows within a subset of columns. For example:

    # copy gx matrix to scramble
    gx <- gx.filt.test

    # generate sample of cols to scramble:
    scr.cols <- sample(x=ncol(gx), size=scr.number)

    # shuffle each selected column independently
    for (j in seq_along(scr.cols)) {
        gx[, scr.cols[j]] <- sample(gx[, scr.cols[j]])
    }

This is, unsurprisingly, very slow due to the for() loop.

Using apply() is certainly faster, but is still quite slow:

    # drop = FALSE keeps a matrix even when only one column is selected
    gx[, scr.cols] <- apply(X = gx[, scr.cols, drop = FALSE], MARGIN = 2, FUN = sample)
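
One loop-free alternative, offered as a sketch rather than a benchmarked answer: draw one random key per cell and order the cells by (column, key), which shuffles every selected column independently in a single call to order(). The toy matrix, the set.seed() value and the value of scr.number below are made up for illustration, and whether this actually beats apply() on a real matrix is worth timing before relying on it.

```r
# Toy stand-in for the expression matrix (rows = genes, columns = samples)
set.seed(1)
gx <- matrix(rnorm(1000 * 20), nrow = 1000, ncol = 20)
gx.orig <- gx                                     # kept only for comparison

scr.number <- 5                                   # hypothetical noise level
scr.cols <- sample(ncol(gx), size = scr.number)

# Pull out the columns to scramble; drop = FALSE keeps a matrix even for one column
sub <- gx[, scr.cols, drop = FALSE]

# Sort the cell indices by column, then by a random key: each column's block of
# indices ends up in random order, so values never move between columns
idx <- order(col(sub), runif(length(sub)))
sub[] <- sub[idx]

# Write the scrambled columns back into the full matrix
gx[, scr.cols] <- sub
```

Wrapping these lines in a function of scr.number would make it easy to sweep over increasing noise levels.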

Is there an obvious way to do this that I'm missing, or a package that has a function to do so quickly?

Thanks very much in advance!


Is this something that vegan::permat could be used for?


Ah, yes, the strata option looks like it could do the job, although from my reading of the manpage the permutations still occur within each specified stratum, whereas I want to permute the data within only one of those strata.

Thank you for the suggestion, I will try it out next time. It turns out that the co-expression calculations take far longer than the sampling, so making this step slightly faster has no effect on the total runtime anyway!
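
A quick way to check where the time goes is to wrap each step in system.time(). The sketch below uses a small random matrix and a plain cor() between genes as a stand-in for the real co-expression calculation, so the numbers it prints say nothing about the actual pipeline; only the pattern is meant to carry over.

```r
set.seed(1)
gx <- matrix(rnorm(2000 * 30), nrow = 2000, ncol = 30)  # toy data, not the real matrix
scr.cols <- sample(ncol(gx), size = 10)                 # hypothetical choice of columns

# Time the scrambling step (the apply() version from the question)
t.shuffle <- system.time(
    gx[, scr.cols] <- apply(gx[, scr.cols, drop = FALSE], 2, sample)
)

# Time a stand-in for the co-expression step: gene-by-gene Pearson correlation
t.coexpr <- system.time(
    gene.cor <- cor(t(gx))
)

# Elapsed wall-clock seconds for each step
print(c(shuffle = t.shuffle[["elapsed"]], coexpression = t.coexpr[["elapsed"]]))
```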


