GSVA analysis and limma feature
1
1
Entering edit mode
6.1 years ago
derricca0708 ▴ 10

Hello, I'm working on GSVA and further analysis using Limma just like the https://bioconductor.org/packages/release/bioc/vignettes/GSVA/inst/doc/GSVA.pdf figure 4.

My data is clustered into 3 types, according to the TCGA protocol for LUAD. I followed the description from the GSVA.pdf and the answers from biostars written by Kevin Blighe.

{My R code for analysis}

lung_es <- gsva(exampleSet, gm, method = "gsva", min.sz=10, max.sz=9999, verbose=TRUE, abs.ranking=FALSE, no.bootstraps=0, bootstrap.percent = .632)
adjPvalueCutoff <- 0.05  #P-value 
logFCcutoff <- log2(2)
design <- model.matrix(~factor(lung_es$TCGA_subtype))
colnames(design) <- c("TRU_Bron", "PI_Squam", "PP_Magn")
fit <- lmFit(lung_es, design)
fit <- eBayes(fit)
allGeneSets <- topTable(fit, coef=NULL, number=Inf)
DEgeneSets <- topTable(fit, coef=NULL, number=Inf, p.value=adjPvalueCutoff, adjust="BH")
res <- decideTests(fit, p.value=adjPvalueCutoff)
summary(res)
#          TRU_Bron PI_Squam PP_Magn
#Down          8      168     209
#NotSig     1143      834     793
#Up           13      162     162
sig.genes <- c(names(res[res[,2]==1,2]), names(res[res[,2]==-1,2]))
lung_eset.siggenes <- lung_es[which(rownames(lung_es) %in% sig.genes),]

so, this was the answer and I followed without question... turns out the example (leukemia) has two subtype (ALL, MLL)

BUT, I have three subtype. So i think that code getting sig.genes should be different.

Any suggestion about getting differently activated or deactivated pathways among three subtypes??

R gene • 4.7k views
ADD COMMENT
0
Entering edit mode
6.1 years ago

Hello, I see my name mentioned in your question, and you must be referring to this answer that I gave: GSVA enrichment score and heatmap

If you have 3 conditions, then I would repeat the code 3 times in order to cover all possible pairwise comparisons. Does that make sense?

ADD COMMENT
0
Entering edit mode

Thank you for the quick and kind answer... By 'repeat code 3 times', do you mean that I'll have to repeat the code

sig.genes <- c(names(res[res[,2]==1,2]), names(res[res[,2]==-1,2]))

lung_eset.siggenes <- lung_es[which(rownames(lung_es) %in% sig.genes),]

to

sig.genes1 <- c(names(res[res[,1]==1,2]), names(res[res[,1]==-1,2]))

sig.genes2 <- c(names(res[res[,2]==1,2]), names(res[res[,2]==-1,2]))

sig.genes3 <- c(names(res[res[,3]==1,2]), names(res[res[,3]==-1,2]))

or are you suggesting I should repeat from the 'lmFit'?

ADD REPLY
0
Entering edit mode

I was more thinking about creating the model matrix with the 3 levels/conditions and then doing different contrasts, as one would normally do in Limma via makeContrasts(). Are you very familiar with the limma pipeline?

ADD REPLY
0
Entering edit mode

Thank you again for the answer.. I'm not familiar with the limma pipeline,

but I found makeContrasts() and other codes in Limma users guideline's 9.3 Several Groups. It explains how to compare three groups.

> adjPvalueCutoff <- 0.05
> 
> logFCcutoff <- log2(2)
> 
> f <- factor(lung_es$TCGA_subtype, levels = c("TRU_Bron", "PI_Squam", "PP_Magn"))
> design1 <- model.matrix(~0+f)
> colnames(design1) <- c("TRU_Bron", "PI_Squam", "PP_Magn")
> fit <- lmFit(lung_es, design1)
> contrast.matrix <- makeContrasts(PI_Squam-TRU_Bron, PP_Magn-PI_Squam, TRU_Bron-PP_Magn, levels = design1)
> fit2 <- contrasts.fit(fit, contrast.matrix)
> fit2 <- eBayes(fit2)
> 
> top <- topTableF(fit2, number= 100)

the guide said 'The statistic fit2$F and the corresponding fit2$F.p.value combine the three pair-wise comparisons into one F-test.'

does this mean that if I want to find out genes or pathways vary between three subtypes, I should look into fit2$F.p.value??

ADD REPLY
0
Entering edit mode

Yes, you have done it correctly in that example. You then have multiple options when you want to derive the statistics:

  1. Derive separate stats for each pairwise comparison
  2. Derive ANOVA stats comparing all 3 groups

Both of these require the use of coefficients, which are specified in contrasts.fit I believe. There was actually a similar question posed on BIoconductor, where one of the key developers of Limma gave a fairly good response (and suggests to look at the manual!):

I would look myself but I am pressed for time.

Obviously, once you get your genes, these are then passed to gsva() for enrichment

ADD REPLY

Login before adding your answer.

Traffic: 2698 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6