Question

DESeq2 Differentially Expressed Genes between conditions

0

6.1 years ago

bryce.kirby • 0

Hello friends!

I have been working with my RNA-seq output after running it through the DESeq2 package in R. I have a file, "genes.txt", listing the genes across my 9 samples with their p-values, log2 fold changes, etc. I would like to compare, let's say, Control vs. Knockdown, and then use that output as input for GSEA.

Here is my phenotype file:

name    type
sample1 Control
sample2 KD
sample3 OE
sample4 Control
sample5 KD
sample6 OE
sample7 Control
sample8 KD
sample9 OE

Samples are grouped in threes: samples 1, 2, and 3 are from the same cell line, and so on for the rest.

This is the code I am currently working with. It produces "global" values across all of these samples, and I'd like the output to take into account which values belong to which sample/condition (type). Is there a simple way to do this? I'd like to end up comparing the 3 controls vs. the 3 KDs in GSEA as the next step.

Here is the code I currently have:

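# biocLite() has been retired; BiocManager is the current Bioconductor installer
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("DESeq2")
library("DESeq2")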

countData <- as.matrix(read.csv("transcript_count_matrix.csv", row.names = "transcript_id"))
colData <- read.csv("pheno_data", sep="\t", row.names = 1)

all(rownames(colData) %in% colnames(countData))

countData<-countData[, rownames(colData)]
all(rownames(colData) == colnames(countData))

#create deseq DataSet from count matrix and labels
dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~ type)

#Run the default analysis for DESeq2 and generate results table

dds <- DESeq(dds)
res <- results(dds)

#Sort by adjusted p-value and display 

(resOrdered <- res[order(res$padj), ])

write.table(resOrdered, file = "genes.txt", sep ="\t")

#SORT BY CONDITION TYPE



# READ INTO GSEA SECTION

Any guidance on this would be greatly appreciated!

Thanks so much,

Bryce

RNA-Seq Deseq2 Differential Expression R GSEA • 3.1k views

ADD COMMENT • link updated 6.1 years ago by Buffo ★ 2.4k • written 6.1 years ago by bryce.kirby • 0

score 2 · Answer 1 · 2018-04-03

2

6.1 years ago

Buffo ★ 2.4k

You need to use DESeq2 contrasts to compare specific conditions, then filter the results by p-value, adjusted p-value, etc., and finally pass them to GSEA.
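
As a rough sketch of what a contrast looks like in practice, the snippet below fits a model on simulated counts (from DESeq2's `makeExampleDESeqDataSet`) and extracts KD vs. Control. The `cell_line` column is a hypothetical addition, based only on the question's note that samples 1-3, 4-6 and 7-9 each share a cell line; the 0.05 cutoff and the choice of the Wald statistic for ranking are illustrative assumptions, not recommendations from this thread.

```r
library(DESeq2)

# Simulated counts stand in for the real matrix (columns sample1..sample9)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 9)

# Condition labels in the order of the phenotype file; Control is the reference level
dds$type <- factor(rep(c("Control", "KD", "OE"), times = 3),
                   levels = c("Control", "KD", "OE"))
# Hypothetical column: samples 1-3, 4-6 and 7-9 each share a cell line
dds$cell_line <- factor(rep(c("line1", "line2", "line3"), each = 3))

design(dds) <- ~ cell_line + type
dds <- DESeq(dds)
resultsNames(dds)  # lists the coefficients that were fitted

# contrast = c(design variable, numerator level, denominator level)
res_kd <- results(dds, contrast = c("type", "KD", "Control"))

# Filter on adjusted p-value (the threshold is an arbitrary example)
res_kd_df <- as.data.frame(res_kd)
res_kd_sig <- res_kd_df[!is.na(res_kd_df$padj) & res_kd_df$padj < 0.05, ]

# Ranked vector (Wald statistic, highest first), one possible input for a
# pre-ranked GSEA run; sort() drops features whose statistic is NA
ranked <- sort(setNames(res_kd_df$stat, rownames(res_kd_df)), decreasing = TRUE)
```

Swapping `"KD"` for `"OE"` in the contrast gives the OE vs. Control table from the same fitted object, so `DESeq()` only needs to run once.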

ADD COMMENT • link 6.1 years ago by Buffo ★ 2.4k

0

Hi, thanks for the information! I believe I have made good progress with your help. I have the results from my run using the code below, but I'd still like to show which values correspond to which sample. Basically, what would be the best way to add another column that indicates which sample goes with which value? Also, I'm not sure why it's comparing OE3 with Control1 by default. I would really appreciate any suggestions for figuring this out. Thank you!

> (resOrdered <- res[order(res$padj), ])
log2 fold change (MLE): group OE3 vs Control1 
Wald test p-value: group OE3 vs Control1 
DataFrame with 229551 rows and 6 columns
                 baseMean log2FoldChange     lfcSE         stat       pvalue       padj
                <numeric>      <numeric> <numeric>    <numeric>    <numeric>  <numeric>
ENST00000462898  801.1637     -13.050030  3.351083    -3.894273 9.849388e-05 0.09158196
ENST00000397492 1374.4390       8.500578  2.211868     3.843166 1.214570e-04 0.09158196
ENST00000482918  655.4429     -12.720482  3.310872    -3.842033 1.220194e-04 0.09158196
MSTRG.2890.8     691.0694     -12.466736  3.322664    -3.752030 1.754083e-04 0.09158196
ENST00000543146  877.2129     -12.163227  3.207941    -3.791599 1.496802e-04 0.09158196


#create deseq DataSet from count matrix and labels
dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~  type)

#----- Comparisons: run the default analysis for DESeq2 and generate results table
dds$group <- factor(paste0(dds$type))
design(dds) <- ~ group

dds <- DESeq(dds)
resultsNames(dds)

#______



#dds <- DESeq(dds)
res <- results(dds)

#Sort by adjusted p-value and display 

(resOrdered <- res[order(res$padj), ])

write.table(resOrdered, file = "genes.txt", sep ="\t")

ADD REPLY • link 6.1 years ago by bryce.kirby • 0

0

I usually do it like this:

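> conds <- c("Control","KD","OE","Control","KD","OE","Control","KD","OE")

> # one row per sample, in the same order as the columns of countData
> coldat <- DataFrame(conds = factor(conds), row.names = colnames(countData))

> dds <- DESeqDataSetFromMatrix(countData = countData, colData = coldat, design = ~ conds)

> # fit the model before asking for results
> dds <- DESeq(dds)

> # contrast = c(variable in the design, numerator level, denominator level)
> res <- results(dds, contrast = c("conds","OE","Control"))

> res_sig <- subset(res, pvalue<=.05 & abs(log2FoldChange)>=1)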
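
On why `results(dds)` picks OE vs. Control when no contrast is given: it reports the last coefficient of the design, which is the last factor level compared against the reference (first) level, and R orders factor levels alphabetically unless told otherwise. The base-R sketch below illustrates this with the labels from the phenotype file; the `pheno` data frame is a stand-in built for the example. It also shows a sample-to-condition lookup, since a results table has one row per transcript for a given comparison rather than one value per sample.

```r
# Condition labels in the order of the phenotype file
type <- factor(c("Control", "KD", "OE", "Control", "KD", "OE",
                 "Control", "KD", "OE"))

# Alphabetical by default: the first level is the reference, the last level
# is what an argument-free results() call would compare against it
levels(type)

# Setting the reference explicitly avoids relying on alphabetical order
type <- relevel(type, ref = "Control")

# Stand-in phenotype table mapping each sample to its condition
pheno <- data.frame(name = paste0("sample", 1:9), type = type,
                    stringsAsFactors = FALSE)

# Which samples belong to which condition
by_type <- split(pheno$name, pheno$type)
by_type$KD
```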

ADD REPLY • link 6.1 years ago by mbk0asis ▴ 690

score 1 · Answer 2 · 2018-04-03

1

6.1 years ago

mbk0asis ▴ 690

To generate a normalized count table, run

cnts_Norm <- counts(dds, normalized = TRUE)
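
A small sketch of how the normalized table could be restricted to the two conditions being compared, again on simulated counts from `makeExampleDESeqDataSet` rather than the real data. The `type` labels follow the phenotype file in the question; the object names are made up for the example.

```r
library(DESeq2)

# Simulated counts stand in for the real matrix (columns sample1..sample9)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 9)
dds$type <- factor(rep(c("Control", "KD", "OE"), times = 3))

# Normalized counts only need size factors, not the full DESeq() fit
dds <- estimateSizeFactors(dds)
cnts_norm <- counts(dds, normalized = TRUE)

# Keep only the Control and KD samples, with matching labels
keep <- dds$type %in% c("Control", "KD")
cnts_ctrl_kd <- cnts_norm[, keep]
labels_ctrl_kd <- droplevels(dds$type[keep])

# Per-sample values now carry their condition alongside the sample name
data.frame(sample = colnames(cnts_ctrl_kd), type = labels_ctrl_kd)
```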

Then use that table for GSEA. You may also want to look at 'gage', an R package for gene set enrichment analysis.
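
If the GSEA desktop application is the target, it reads an expression table in GCT format plus a CLS file of class labels. The base-R sketch below writes both for a Control vs. KD comparison; the toy matrix, the helper names `write_gct` and `write_cls`, and the output file names are all assumptions made for illustration.

```r
# Toy normalized-expression matrix: 5 features x 6 samples (Control and KD only)
set.seed(1)
expr <- matrix(round(runif(5 * 6, 10, 1000), 2), nrow = 5,
               dimnames = list(paste0("gene", 1:5),
                               paste0("sample", c(1, 2, 4, 5, 7, 8))))
labels <- c("Control", "KD", "Control", "KD", "Control", "KD")

# GCT: version line, dimensions line, then a NAME/Description/sample table
write_gct <- function(expr, path) {
  con <- file(path, "w")
  on.exit(close(con))
  writeLines("#1.2", con)
  writeLines(paste(nrow(expr), ncol(expr), sep = "\t"), con)
  out <- data.frame(NAME = rownames(expr), Description = "na", expr,
                    check.names = FALSE)
  write.table(out, con, sep = "\t", quote = FALSE, row.names = FALSE)
}

# CLS: "<samples> <classes> 1", then "# class names", then one label per sample
write_cls <- function(labels, path) {
  classes <- unique(labels)  # order of first appearance
  writeLines(c(paste(length(labels), length(classes), 1),
               paste("#", paste(classes, collapse = " ")),
               paste(labels, collapse = " ")), path)
}

gct_path <- file.path(tempdir(), "control_vs_kd.gct")
cls_path <- file.path(tempdir(), "control_vs_kd.cls")
write_gct(expr, gct_path)
write_cls(labels, cls_path)
```

Gene sets are defined on gene identifiers, so a transcript-level table like the one in the question would likely need to be mapped or collapsed to genes before it matches any gene set.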

ADD COMMENT • link 6.1 years ago by mbk0asis ▴ 690