Question: Deseq2 Differentially Expressed Genes between conditions
0
gravatar for bryce.kirby
11 months ago by
bryce.kirby0 wrote:

Hello friends!

I have been working with my RNAseq output after running it thru the Deseq package in R. I have a list "genes.txt" of the genes in my 9 samples, the pvals, log2 fold change etc. I would like to compare lets say: Control vs. Knockdown and then use that input to use for GSEA.

Here is my phenotype file:

name    type
sample1 Control
sample2 KD
sample3 OE
sample4 Control
sample5 KD
sample6 OE
sample7 Control
sample8 KD
sample9 OE

Samples are grouped in 3s. So 1,2,3 are from the same cell line and so on for the rest.

This is the current code I am working with that provides the "global" values of all these samples, and Id like the output to take into account which values belong with which sample/condition(type). Is there a simplistic way to do this ? Id like to end up comparing the 3 controls vs the 3 KDs in GSEA as the next step.

Here is the code I currently have:

source("http://bioconductor.org/biocLite.R")
biocLite("ballgown")
biocLite
library("DESeq2")

countData <- as.matrix(read.csv("transcript_count_matrix.csv", row.names = "transcript_id"))
colData <- read.csv("pheno_data", sep="\t", row.names = 1)

all(rownames(colData) %in% colnames(countData))

countData<-countData[, rownames(colData)]
all(rownames(colData) == colnames(countData))

#create deseq DataSet from count matrix and labels
dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~ type)

#Run the default analysis for DESeq2 and generate results table

dds <- DESeq(dds)
res <- results(dds)

#Sort by adjusted p-value and display 

(resOrdered <- res[order(res$padj), ])

write.table(resOrdered, file = "genes.txt", sep ="\t")

#SORT BY CONDITION TYPE



# READ INTO GSEA SECTION

Any guidance on this I would greatly appreciate!

Thanks so much,

Bryce

ADD COMMENTlink modified 11 months ago by Buffo1.3k • written 11 months ago by bryce.kirby0
2
gravatar for Buffo
11 months ago by
Buffo1.3k
Buffo1.3k wrote:

You need to perform DESeq2 contrasts to compare samples, then filter results by pvalue, padjusted, etc and finally to GSEA.

ADD COMMENTlink written 11 months ago by Buffo1.3k

Hi thanks for the information! I believe I have made good progress using your help. I have the results from my run using that code, but I'd still like to add which values correspond to which sample. So basically, what would be the best method to add another column that categorizes which sample goes with which value? Also I'm not sure why its comparing OE3 with Control1 as the default? Any suggestions figuring this out I would really appreciate! Thank you!

> (resOrdered <- res[order(res$padj), ])
log2 fold change (MLE): group OE3 vs Control1 
Wald test p-value: group OE3 vs Control1 
DataFrame with 229551 rows and 6 columns
                 baseMean log2FoldChange     lfcSE         stat       pvalue       padj
                <numeric>      <numeric> <numeric>    <numeric>    <numeric>  <numeric>
ENST00000462898  801.1637     -13.050030  3.351083    -3.894273 9.849388e-05 0.09158196
ENST00000397492 1374.4390       8.500578  2.211868     3.843166 1.214570e-04 0.09158196
ENST00000482918  655.4429     -12.720482  3.310872    -3.842033 1.220194e-04 0.09158196
MSTRG.2890.8     691.0694     -12.466736  3.322664    -3.752030 1.754083e-04 0.09158196
ENST00000543146  877.2129     -12.163227  3.207941    -3.791599 1.496802e-04 0.09158196


#create deseq DataSet from count matrix and labels
dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~  type)

#----- Comparisons #Run the default analysis for DESeq2 and generate results table
dds$group <- factor(paste0(dds$type))
design(dds) <- ~ group

dds <- DESeq(dds)
resultsNames(dds)

#______



#dds <- DESeq(dds)
res <- results(dds)

#Sort by adjusted p-value and display 

(resOrdered <- res[order(res$padj), ])

write.table(resOrdered, file = "genes.txt", sep ="\t")

ADD REPLYlink written 11 months ago by bryce.kirby0

I usually do like below.

> conds <- c("Control","KD","OE","Control","KD","OE","Control","KD","OE")

> coldat=DataFrame(conds=factor(conds))

> dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~  type)


> res <- results(dds, contrast = c("conds","OE","Control"))

> res_sig <- subset(res, pvalue<=.05 & abs(log2FoldChange)>=1)
ADD REPLYlink written 11 months ago by mbk0asis410
1
gravatar for mbk0asis
11 months ago by
mbk0asis410
Korea, Republic Of
mbk0asis410 wrote:

To generate a nomalized count table, run

cnts_Norm <-  counts(dds, normalized = T).

Then, use the table for GSEA. And you may want to look for 'gage', a R package for GSEA.

ADD COMMENTlink modified 11 months ago • written 11 months ago by mbk0asis410
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1158 users visited in the last hour