Question: Normalization for comparing expression statistics across sample and gene groups
9 months ago, endre.sebestyen wrote:

Hi,

What is the current best practice for normalizing gene expression counts if I want to compare different characteristics of genes, and particularly of gene groups (min, max, mean, sd of expression), between two conditions? I'm interested in questions like: "Is the variance of expression means larger in condition A than in condition B for a specific gene group?" For example, genes in group X might have more variable mean expression in A than in B, while this is not true for gene group Y.

I guess I have to normalize for library size, gene length, and also correct for the mean-variance dependence of expression.

Maybe vst followed by an RPKM or TPM transformation? Any other suggestions? I'm not sure whether an RPKM or TPM transformation can be applied after vst.

This is an example dataset:

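# Simulated counts: 20 genes (rows) x 5 samples (columns)
data <- as.data.frame(matrix(rpois(100, lambda = 10), ncol = 5))
# Three samples from condition A, two from condition B
colnames(data) <- c("A1", "A2", "A3", "B1", "B2")
genes <- paste0("gene", 1:20)
# Genes 1-15 belong to group X, genes 16-20 to group Y
gene_group <- c(rep("X", 15), rep("Y", 5))
data <- cbind(data, genes, gene_group)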
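
To make the comparison concrete, here is a minimal sketch of the statistic in question on simulated data of the same shape: the variance of per-gene mean expression, computed for each gene group and condition. The helper name `group_var_of_means` and the `condition` vector are hypothetical, and raw counts are used untransformed purely for illustration.

```r
set.seed(1)
# Simulated counts: 20 genes x 5 samples (3 from condition A, 2 from B)
counts <- matrix(rpois(100, lambda = 10), ncol = 5,
                 dimnames = list(paste0("gene", 1:20),
                                 c("A1", "A2", "A3", "B1", "B2")))
gene_group <- c(rep("X", 15), rep("Y", 5))
condition <- c("A", "A", "A", "B", "B")

# Hypothetical helper: variance of per-gene means, per gene group and condition
group_var_of_means <- function(mat, gene_group, condition) {
  sapply(unique(condition), function(cond) {
    # Mean expression of every gene within this condition
    gene_means <- rowMeans(mat[, condition == cond, drop = FALSE])
    # Variance of those means within each gene group
    tapply(gene_means, gene_group, var)
  })
}

# Rows are gene groups (X, Y), columns are conditions (A, B)
res <- group_var_of_means(counts, gene_group, condition)
print(res)
```

The same helper could be pointed at a normalized matrix instead of `counts` once a normalization has been chosen.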
5 months ago, kristoffer.vittingseerup (European Union) wrote:

As you suggest, I would use vst and then normalize for gene length afterwards (vst / gene_length * 1e3). You could check with the cqn package whether GC normalisation seems to be needed.
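
A rough sketch of that order of operations, on simulated data. To keep it free of package dependencies, `log2(CPM + 1)` is used here as a simple stand-in for a variance-stabilized matrix; it is not DESeq2's vst, and in practice the vst output would take its place. The gene lengths are made-up values, and `stabilized`, `gene_length` and `length_scaled` are illustrative names.

```r
set.seed(1)
# Simulated counts: 20 genes x 5 samples
counts <- matrix(rpois(100, lambda = 10), ncol = 5,
                 dimnames = list(paste0("gene", 1:20),
                                 c("A1", "A2", "A3", "B1", "B2")))

# Assumption: gene lengths in bp (made-up values for illustration)
gene_length <- round(runif(20, min = 500, max = 5000))

# Library-size normalization: counts per million for each sample
cpm <- sweep(counts, 2, colSums(counts), "/") * 1e6

# Stand-in for a variance-stabilized matrix (NOT DESeq2's vst)
stabilized <- log2(cpm + 1)

# Length scaling as written above: value / gene_length * 1e3.
# R recycles gene_length down the rows, so each gene is divided by its own length.
length_scaled <- stabilized / gene_length * 1e3
```

Since vst values are on a log2-like scale, this division is a heuristic rescaling and not the same as an RPKM computed from counts, so it is worth checking whether a length trend remains afterwards.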
