Question: Mutation effect on expression (RNA-Seq) using limma
gravatar for Nicolas Rosewick
3.8 years ago by
Belgium, Brussels
Nicolas Rosewick8.8k wrote:


I'm investigating the potential effect of several mutations on gene expression. I've RNA-seq from ~30 tumors and 4 control samples. I screened for these 30 tumors about 20 genes for mutations. I translated these mutation data into a binary matrix (rows = samples, columns = genes ; 0 = not mutated, 1 = mutated).

For the RNA-Seq I aligned with STAR and counted the number of reads per gene with featurecounts. Then I merged everything into a big read count matrix (column = sample ; rows=gene).

Now I want to detect genes which expression is affected by these mutations. So I used limma as follow:

The design matrix harbors a column for each gene (~20), and an additional column indicating if the samples is a tumor or not. I also add an offset column. For example :

mut dataframe :

         offset geneA   geneB   geneC   isTumor
sample1      1      0       1       1         1
sample2      1      1       1       0         1  
sample3      1      0       0       1         0

So :

design <- mut
design <- design[,colSums(design)>2] # remove gene that is mutated in less than 3 samples
design <- cbind(offset=1,design) # add offset
# egdeR
dge <- DGEList(counts=counts)
dge <- calcNormFactors(dge)
cpm <- cpm(dge,log=T)
keep.exprs <- rowSums(cpm>1)>=3 # min 3 samples with CPM > 1
dge <- dge[keep.exprs,]
# limma voom
geneExpr <- voom(dge,design)
glm = lmFit(geneExpr$E, design = design) 
glm = eBayes(glm)

testResults <- decideTests(glm, method="global",adjust.method="BH", p.value=0.05)[,-1]

Now in testResults I've a matrix containing for each gene and each covariate (the 20 screened genes for mutation) if it's up (=1) or down regulated (=-1) ; (no significant change = 0). Is that correct for you ?


edit : I posted the question to bioconductor forum as adviced by Goutham Atla.

rna-seq limma • 1.5k views
ADD COMMENTlink modified 3.8 years ago • written 3.8 years ago by Nicolas Rosewick8.8k

You would get much quicker response on for questions related to statistics and bioconductor packages. Most of the authors are very active there.

ADD REPLYlink written 3.8 years ago by geek_y11k
gravatar for Devon Ryan
3.8 years ago by
Devon Ryan95k
Freiburg, Germany
Devon Ryan95k wrote:

What you're doing doesn't make any sense. You're applying the same model matrix to every gene. What you want is a different model matrix for each gene. That is, for each gene you want a design of the form ~tumour + mutant, where each of those is a factor with two levels. How to do this has been covered elsewhere.

ADD COMMENTlink written 3.8 years ago by Devon Ryan95k

Thanks Devon. FYI I based my analysis on this paper : . Supplementary data 2 contains the original R code I used to wrote the few lines above. So I'm confused now because their experimental design is quiet similar..

ADD REPLYlink written 3.8 years ago by Nicolas Rosewick8.8k

Perhaps I'm misunderstanding the question you're trying to answer with the test then. I assumed you were trying to see if, in your tumor and normal samples, the presence of a mutation in gene X had an effect on the expression of gene X. If, instead, you want to see if mutations in gene X might be affecting gene Y or Z then this method makes sense.

ADD REPLYlink written 3.8 years ago by Devon Ryan95k

Yes I want to see the effect of mutation in gene X on all the genes in the count table. And that for all of the 20 genes I screened.

ADD REPLYlink written 3.8 years ago by Nicolas Rosewick8.8k

OK, then it totally makes sense to me, for whatever that's worth :)

ADD REPLYlink written 3.8 years ago by Devon Ryan95k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1218 users visited in the last hour