Question: Help with DESeqDataSetFromHTSeqCount?
sandKings wrote:

Hello everyone,

My knowledge of R is mostly copy-pasting and a lot of googling. As of now, I have managed to successfully generate my HTSeq-count files with the help here.

My count files come from the same patients, sampled pre- and post-treatment:

condition fileName
pre p1.pre.htseq.count
pre p2.pre.htseq.count
pre p3.pre.htseq.count
post p1.post.htseq.count
post p2.post.htseq.count
post p3.post.htseq.count

...and so on. In all, I have 30 count files (from 30 patients) in the pre-treatment group and 29 files in the post-treatment group (one patient didn't come back for follow-up).

I'm trying to follow the example here:

http://www.bioconductor.org/packages/3.7/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#htseq-count-input

I guess I should start from this point onwards in the vignette and modify the code below to work with my samples:

directory <- "/path/to/your/files/"
sampleFiles <- grep("treated",list.files(directory),value=TRUE)
sampleCondition <- sub("(.*treated).*","\\1",sampleFiles)
sampleTable <- data.frame(sampleName = sampleFiles,
                          fileName = sampleFiles,
                          condition = sampleCondition)

I tried to edit it:

directory <- "/path/to/your/files/"
sampleFiles <- grep("counts",list.files(directory),value=TRUE) #because all my files have the string 'counts'
sampleCondition <- sub("(.*counts).*","\\1",sampleFiles)
sampleTable <- data.frame(sampleName = sampleFiles,
                          fileName = sampleFiles,
                          condition = sampleCondition)

When I run this, I get a table with 3 columns:

sampleName fileName condition

and all 3 columns just list my file names.

To begin with, how do I get the condition column to populate with pre and post?

Can someone help me edit the script to just check for differentially expressed genes between the pre and post groups? Or maybe direct me to another tutorial that is more at my level?

Please let me know if I need to add more details to the question.

Thanks so much in advance!

modified 4 months ago • written 4 months ago by sandKings

Thank you so much! It worked. Can I replace 'dds' with 'ddsHTSeq'? Sorry, it's probably obvious, but I barely know R, as you can tell. As in:

keep <- rowSums(counts(ddsHTSeq)) >= 10
dds <- ddsHTSeq[keep,]
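
To see what these two lines do, here is a minimal sketch on a made-up count matrix. The gene names, sample names and numbers are hypothetical; with a real DESeqDataSet, `counts(ddsHTSeq)` returns a genes-by-samples matrix of the same shape.

```r
# Toy count matrix: rows are genes, columns are samples (made-up numbers)
cts <- matrix(c( 0,  1,  2,
                50, 60, 70,
                 3,  3,  3,
                 4,  3,  3),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                              c("s1", "s2", "s3")))

# TRUE for genes whose counts summed over all samples reach at least 10
keep <- rowSums(cts) >= 10

# Keep only those rows; drop = FALSE preserves the matrix shape
filtered <- cts[keep, , drop = FALSE]
print(filtered)
```

Only the rows with a total count of 10 or more survive, which is the same subsetting that `ddsHTSeq[keep,]` applies to the dataset object.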
modified 4 months ago by WouterDeCoster • written 4 months ago by sandKings

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. This comment belongs under @Devon's answer.

written 4 months ago by genomax

Sorry! Will pay more attention next time.

written 4 months ago by sandKings

I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.

written 4 months ago by WouterDeCoster

Ok! Thanks for pointing it out to me.

written 4 months ago by sandKings

Happy to help. Another remark:

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.

written 4 months ago by WouterDeCoster

Devon Ryan wrote:
sampleCondition <- sapply(strsplit(sampleFiles, ".", fixed = TRUE), function(x) x[2])

The remainder is covered in the DESeq2 tutorial.
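
For illustration, a minimal sketch of how the line above behaves on file names shaped like those in the question, and how the result could fill the sample table. The four file names are hypothetical stand-ins, and the extra `patient` column is an assumption (the question only asks about `condition`); it is taken from the first dot-separated field in the same way.

```r
# Hypothetical file names following the pattern shown in the question
sampleFiles <- c("p1.pre.htseq.count", "p2.pre.htseq.count",
                 "p1.post.htseq.count", "p2.post.htseq.count")

# Split on literal dots: "p1.pre.htseq.count" -> "p1" "pre" "htseq" "count"
parts <- strsplit(sampleFiles, ".", fixed = TRUE)

samplePatient   <- sapply(parts, function(x) x[1])  # first field: patient ID
sampleCondition <- sapply(parts, function(x) x[2])  # second field: pre or post

sampleTable <- data.frame(
  sampleName = sub(".htseq.count", "", sampleFiles, fixed = TRUE),
  fileName   = sampleFiles,
  # Listing "pre" first makes it the reference level
  condition  = factor(sampleCondition, levels = c("pre", "post")),
  # Assumption: patient ID is kept so a paired design is possible
  patient    = factor(samplePatient)
)
print(sampleTable)
```

With the real data, `sampleFiles` would come from `list.files(directory)` as in the question. Because each patient contributes a pre and a post sample, one option (an assumption about the intended analysis, not something stated above) is to pass this table to `DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ patient + condition)` so that the pre/post comparison is paired within patient.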

written 4 months ago by Devon Ryan