Forum:Choosing the appropriate test for differential analysis: Limma or Mann-Whitney (with BH correction)? Test for normality
Entering edit mode
4.1 years ago
innyus ▴ 40

Hi everyone!

Beginner at R.

I am currently analyzing immunohistochemistry expression in two different but related cancers. The data consists of 140 samples and over 250 different variables.

I've done the Shapiro-Wilk test with Bonferroni correction to check for normality and only 11 out of 250 celltypes/marker expression pass the test (i.e. p value over 0.05). I have also checked it visually by doing Q-Q plots for some of the variables (with very low and high p.values) and it seems to check out. So, I did the DE-analysis using M-W test with BH correction instead of limma. Did the limma just to compare and the results differ, limma gives less significant variables with adj.p.val below 0.05.

I've also applied the same test to Nanostring data (pre-processed with nSolver, log-2 transformed, house-keeping genes removed) cohort with 105 samples which I assumed would follow a normal distribution but only around 50% of genes had adj.p.value over 0.05, should I care about that? Does the sample size play a role (e.g. 10 samples vs 100 samples)?

I have used the following code to check for normality:

shapiro_test_df <- function(df, bonf= T, alpha= 0.05) {
  l <- lapply(df, shapiro.test)
  s <-"c", lapply(l, "[[", 1))
  p <-"c", lapply(l, "[[", 2))
  if (bonf == T) {
    sig <- ifelse(p > alpha / length(l), "H0", "Ha")
  } else {
    sig <- ifelse(p > alpha, "H0", "Ha")
  return(list(statistic= s,
              p.value= p,
              significance= sig,
              method= ifelse(bonf == T, "Shapiro-Wilks test with Bonferroni Correction",
                             "Shapiro-Wilks test without Bonferroni Correction")))

list <- shapiro_test_df(data)

data.adj.p.val <- list$p.value
data.adj.p.val <-

And this code for M-U test with BH correction:

out <- lapply(27:280, function(x) pairwise.wilcox.test(data[[x]], data$Subtype))
names(out) <- names(data)[27:280] 

a <- sapply(out, function(x) {
  p <- x$p.value
  n <- outer(rownames(p), colnames(p), paste, sep='v')
  p <- as.vector(p)
  names(p) <- n

m.w.adj.P.Val <- round(p.adjust(a, "BH"), 5)
m.w.adj.P.Val <-

Question 1: Test for normality, is this the to go? Should one use M-W test with BH correction in this case (instead of limma)?

Question 2: Does the sample size matter, e.g. 10 vs 100 samples for normality test? Lets say I want to check differential expression according to response in a small sample.

Question 3: Can M-W test with BH correction somehow damage the results if the data is normally distributed?

Question 4: Does the code for normaly testing and M-W looks right?

R Shapiro-Wilk Limma Mann-Whitney • 2.2k views
Entering edit mode

With immunehistochemistry you mean that you stained tissue slides with an antibody and then recorded the signal? Or what kind of data is this?

Entering edit mode

Yes. Multiplex IHC which is then digitally quantified.

Entering edit mode

ATpoint do information about immunehistochemistry really matter here. Could you please elaborate how this test will vary when different immunehistochemistry is used.

Entering edit mode

It always matters which assay you have. Some assays give discrete values, others continuous ones, some only are informative towards changes between conditions (such as arrays), others have a wider dynamic range (such as RNA-seq). I am not a statistician, but given the large amount of samples then you probably are good to go with a Wilcox test followed by BH correction. Traditional tests are often underpowered when you have small samples sizes like n=3, this is why RNA-seq needs special tests that increases power by borrowing information across all genes. If n becomes larger as in scRNA-seq then people go back using tests like t-tests and Wilcox because the large n provides enough power while still being computationally inexpensive.

Entering edit mode

Thank you!

The data I am using is continous for each IHC marker. Then again, not sure if ”continuity” of the data is applicable between different markers...because they are different antibodies. The data is numerical with range 0-1 corresponding to proportion of cells in TMA positive for a particular marker e.g. IDO1 or TIM3 positivity. So some samples would have basically 0 expression and som eg 0.1 (10%).

So when I did the Shapiro-Wilks test only 7-10% of all markers passed the test for normality (comp to over 50% for Nanostring gene data) and Q-Q plots looked teribble for samples with low p values in S-W test.

That is why I chose a non-parametric test (Wilcox test) for DE analysis, instead of using limma package. Hope that is the right way. Question in my original post are above.


Login before adding your answer.

Traffic: 1293 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6