Help with DESeqDataSetFromHTSeqCount?
6.8 years ago
sandKings ▴ 40

Hello everyone,

My knowledge of R is mostly copy-pasting and a lot of googling. As of now, I have managed to successfully generate my HTSeq-count files with the help here.

My count files come from the same patients, sampled pre- and post-treatment:

condition fileName
pre p1.pre.htseq.count
pre p2.pre.htseq.count
pre p3.pre.htseq.count
post p1.post.htseq.count
post p2.post.htseq.count
post p3.post.htseq.count

... and so on. In all, I have 30 count files (from 30 patients) in the pre-treatment group and 29 files in the post-treatment group (1 patient didn't come back for follow-up).

I'm trying to follow the example here:

http://www.bioconductor.org/packages/3.7/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#htseq-count-input

I guess I should start from this point onwards in the vignette and modify the code below to work with my samples:

directory <- "/path/to/your/files/"
sampleFiles <- grep("treated",list.files(directory),value=TRUE)
sampleCondition <- sub("(.*treated).*","\\1",sampleFiles)
sampleTable <- data.frame(sampleName = sampleFiles,
                          fileName = sampleFiles,
                          condition = sampleCondition)

I tried to edit it:

directory <- "/path/to/your/files/"
sampleFiles <- grep("counts",list.files(directory),value=TRUE) #because all my files have the string 'counts'
sampleCondition <- sub("(.*counts).*","\\1",sampleFiles)
sampleTable <- data.frame(sampleName = sampleFiles,
                          fileName = sampleFiles,
                          condition = sampleCondition)

When I run this, I get a table with 3 columns:

sampleName fileName condition

and all 3 columns just list my file names.

To begin with, how do I get the condition column to populate with pre and post?

Can someone help me edit the script so that it just tests for differentially expressed genes between the pre and post groups? Or maybe direct me to another tutorial that is more at my level?

Please let me know if I need to add more details to the question.

Thanks so much in advance!

DESeqDataSetFromHTSeqCount RNA-Seq Deseq2 • 7.1k views

Thank you so much! It worked. I assume I can replace 'dds' with 'ddsHTSeq' in the rest of the vignette code? Sorry, it's probably obvious, but I barely know R, as you can tell. As in:

keep <- rowSums(counts(ddsHTSeq)) >= 10
dds <- ddsHTSeq[keep,]

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. This comment belongs under @Devon's answer.


Sorry! Will pay more attention next time.


I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.


Ok! Thanks for pointing it out to me.


Happy to help. Another remark:

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.

6.8 years ago
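# Split each file name on literal dots and keep the second field,
# e.g. "pre" from "p1.pre.htseq.count".
sampleCondition <- sapply(strsplit(sampleFiles, ".", fixed = TRUE), function(x) x[2])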

The remainder is covered in the DESeq2 tutorial.
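To make that concrete, here is a minimal sketch of how the whole sample table might be built, assuming every file follows the `<patient>.<condition>.htseq.count` pattern shown in the question. The four file names, the extra `patient` column, and the `~ patient + condition` design are illustrative assumptions, not something stated in the thread; the paired design is one common way to account for pre and post samples coming from the same patients. The DESeq2 part is guarded so that it only runs when the package is installed and the placeholder directory actually contains the files.

```r
# Hypothetical file names following the pattern in the question:
# <patient>.<condition>.htseq.count
sampleFiles <- c("p1.pre.htseq.count", "p2.pre.htseq.count",
                 "p1.post.htseq.count", "p2.post.htseq.count")

# Split on literal dots: field 1 is the patient, field 2 the condition.
parts <- strsplit(sampleFiles, ".", fixed = TRUE)
samplePatient   <- vapply(parts, function(x) x[1], character(1))
sampleCondition <- vapply(parts, function(x) x[2], character(1))

# First column = sample name, second = file name, the rest = sample metadata.
sampleTable <- data.frame(
  sampleName = sub("\\.htseq\\.count$", "", sampleFiles),
  fileName   = sampleFiles,
  # Put "pre" first so it becomes the reference level.
  condition  = factor(sampleCondition, levels = c("pre", "post")),
  patient    = factor(samplePatient)
)
print(sampleTable)

# Placeholder path from the question; replace with the real directory.
directory <- "/path/to/your/files/"

# Only attempt the DESeq2 steps if the package and the files are available.
if (requireNamespace("DESeq2", quietly = TRUE) &&
    all(file.exists(file.path(directory, sampleFiles)))) {
  ddsHTSeq <- DESeq2::DESeqDataSetFromHTSeqCount(
    sampleTable = sampleTable,
    directory   = directory,
    design      = ~ patient + condition  # assumption: paired design
  )
  dds <- DESeq2::DESeq(ddsHTSeq)
  # Log2 fold changes are reported as post relative to pre.
  res <- DESeq2::results(dds, contrast = c("condition", "post", "pre"))
}
```

With the real data, `sampleFiles` would come from `list.files(directory)` as in the vignette rather than being typed by hand. Whether to keep the patient without a follow-up sample is a separate decision that this sketch does not address.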
