
Hi All,

We are interested in a specific gene mutation and would like to learn how it affects breast cancer.

Using R, I retrieved the mutation information for the sequenced cases in TCGA Provisional and stratified the patients into two categories, Mutated and Wild Type. I downloaded the mRNA expression z-scores (RNA Seq V2 RSEM) from the cBioPortal website. I would like to find the differentially expressed genes between these two groups, but I have several questions:

The RNA-seq data are RSEM-normalized. Before doing any further analysis I transformed them to log2(RSEM + 1). Is that correct?
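To make the transformation concrete, here is a minimal sketch on a toy matrix. The object `rsem` and its values are made up for illustration and are not my actual data; I am assuming genes in rows and samples in columns.

```r
# Toy RSEM-normalized matrix: genes in rows, samples in columns (made-up values)
rsem <- matrix(c(0, 1, 3,
                 7, 15, 1023),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))

# log2(x + 1): the +1 pseudocount keeps zero values finite, since log2(0 + 1) = 0
log_rsem <- log2(rsem + 1)
print(log_rsem)
```

The pseudocount matters only for genes with zero expression in some samples; without it those entries would become `-Inf`.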

For the differential gene expression analysis, what do you suggest I use? I cannot use DESeq2 or edgeR, as they require raw counts as input.

I used the limma package, but I suspect the output shows that my data has some problem. Does it look OK, or should I do something else?

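```
library(edgeR)
library(limma)

# Group labels; this assumes the columns of TCGA_comb are ordered
# as 191 Mut samples followed by 660 WT samples
group <- factor(c(rep("Mut", 191), rep("WT", 660)), levels = c("Mut", "WT"))

# Design matrix without an intercept: one column per group
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

y <- TCGA_comb

# voom estimates the mean-variance trend and computes precision weights
par(mfrow = c(1, 2))
v <- voom(y, design, plot = TRUE)

# Fit the linear model and test the contrast Mut - WT
fit <- lmFit(v, design)
cont.matrix <- makeContrasts(PIK3CA_mutVSwt = Mut - WT, levels = design)
fit.cont <- contrasts.fit(fit, cont.matrix)
fit.cont <- eBayes(fit.cont)
plotSA(fit.cont)

summa.fit <- decideTests(fit.cont)
tab <- topTable(fit.cont, number = Inf, coef = "PIK3CA_mutVSwt")
```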

Would it be too superficial if I calculated the fold change, p-value and FDR on my own?

a) Fold change: take the average of each gene per group, then `log2(B) - log2(A)`

b) p-value: the `t.test` command in R

c) FDR: `p.adjust(pvalue, method="fdr")`
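The three steps above could be sketched roughly as follows. This is only an illustration on simulated data: `expr`, `group` and the group labels `A` and `B` are hypothetical names, and I am assuming the matrix is already on the log2 scale, so the log fold change is the difference of the per-group means (the log of the ratio of geometric means, which is not the same as taking log2 of the arithmetic means).

```r
set.seed(1)

# Toy log2 expression matrix: 5 genes x 10 samples (simulated, not real data)
expr <- matrix(rnorm(50, mean = 8), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("s", 1:10)))
group <- factor(rep(c("A", "B"), each = 5))

# a) Log2 fold change: difference of per-group means of the log2 values
mean_A <- rowMeans(expr[, group == "A"])
mean_B <- rowMeans(expr[, group == "B"])
log_fc <- mean_B - mean_A

# b) One Welch t-test per gene (t.test defaults to var.equal = FALSE)
p_value <- apply(expr, 1, function(x) {
  t.test(x[group == "B"], x[group == "A"])$p.value
})

# c) Benjamini-Hochberg adjustment; "fdr" is an alias for "BH" in p.adjust
fdr <- p.adjust(p_value, method = "fdr")

result <- data.frame(log_fc, p_value, fdr)
print(result)
```

Unlike limma's `eBayes`, this treats every gene independently and does not borrow variance information across genes.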

Many many thanks,

Gokce