Hello Biostars Community,
I would like to compare the expression levels of specific genes between sample groups by making box plots like these:
The figure legend for the paper says:
Plotted values are quantile-normalized log2-cpm. For each group, all samples are plotted in addition to box-plots summarizing the group. * indicates adj. p < 0.05
I am starting with count matrices of different sample groups with technical replicates.
This thread was a good help: How do you generate TMM normalized counts using EdgeR?
Based on the recommendations of this (HBC Training) source, I am fairly convinced that TMM is the best normalization method for this task. Please correct me if I am wrong.
My main question, assuming the above is correct: when should the log transformation be done, and is it necessary at all?
Should I do it here:
#/ make the DGEList:
y <- DGEList(...)
#/ calculate TMM normalization factors:
y <- calcNormFactors(y)
#/ get the normalized counts:
cpms <- cpm(y, log=TRUE)
or instead replace the last line, like so:
log2cpms <- log2(cpm(y, log=FALSE))
Or should I not do it at all?
cpms <- cpm(y, log=FALSE)
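To make the three options concrete, here is a minimal sketch of how they differ on a toy matrix. The counts, gene names, sample names, and groups are made up for illustration and are not my real data. It assumes edgeR is installed and relies on `cpm()` adding a small prior count (the `prior.count` argument, default 2) before taking log2 when `log=TRUE`, whereas a plain `log2()` of the CPM values turns zero counts into `-Inf`.

```r
library(edgeR)

# Hypothetical toy counts: 4 genes x 4 samples, with a few zero counts.
# matrix() fills column by column, so each group of four values is one sample.
counts <- matrix(
  c(0, 10, 200, 1500,
    3,  0, 180, 1700,
    0, 12, 250, 1400,
    5,  8, 220, 1600),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:4))
)
group <- factor(c("A", "A", "B", "B"))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)  # TMM is the default method

# Option 1: log2-CPM with a prior count, so zero counts stay finite
logcpm_prior <- cpm(y, log = TRUE, prior.count = 2)

# Option 2: plain log2 of CPM, so zero counts become -Inf
logcpm_plain <- log2(cpm(y, log = FALSE))

# Option 3: CPM on the original (unlogged) scale
cpm_raw <- cpm(y, log = FALSE)

all(is.finite(logcpm_prior))    # every value is finite
any(is.infinite(logcpm_plain))  # at least one -Inf from a zero count
```

If that is right, the two log versions should only differ noticeably for low counts, and the `-Inf` values from the second version would be a problem for box plots.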
Thank you very much in advance!