Question: How to input data for DESeq2 from individual HTSeq count files?
sudu87 wrote:

I am comparing gene expression between 2 bacteria under 1 condition. I now have count tables for 3 technical replicates of each bacterium.

Bacteria1_1.count 
Bacteria1_2.count 
Bacteria1_3.count

...same for the other bacteria.

These files look like this:

gene1 10000 
gene2 500 
gene3 0 
gene4 5000

I want to use DESeq2 for differential gene expression analysis, but I cannot figure out how to call DESeqDataSetFromHTSeqCount() correctly with this type of data.

Is there an intermediate step I need to add?

rna-seq deseq deseq2 htseq • 4.4k views
modified 22 months ago by poojasethiya • written 22 months ago by sudu87
ZZzzzzhong wrote:
directory <- "/path/to/your/files/"

directory is where your htseq-count output files are located.

sampleFiles <- grep("Bacteria",list.files(directory),value=TRUE)

sampleFiles holds the names of your htseq-count output files (here, every file in directory whose name contains "Bacteria").

condition <- c('Bacteria1','Bacteria1','Bacteria1','Bacteria2','Bacteria2','Bacteria2')

condition gives the sample type of each file: one entry per file, in the same order as sampleFiles.

sampleTable <- data.frame(sampleName = sampleFiles,
                      fileName = sampleFiles,
                      condition = condition)
library("DESeq2")
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,
                                   directory = directory,
                                   design= ~ condition)
written 22 months ago by ZZzzzzhong
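
The steps above can be tried out without DESeq2 installed. The following is a minimal sketch, not part of the original answer, that builds the same sampleTable but derives condition from the file names instead of typing it by hand, so the two cannot get out of order. The temporary directory, the empty files, and the names Bacteria2_1.count to Bacteria2_3.count are assumptions made for illustration (the question only says "same for the other bacteria").

```r
# Hypothetical setup: six empty count files in a temporary directory
directory <- file.path(tempdir(), "htseq_counts")
dir.create(directory, showWarnings = FALSE)
files <- c(paste0("Bacteria1_", 1:3, ".count"),
           paste0("Bacteria2_", 1:3, ".count"))
invisible(file.create(file.path(directory, files)))

# Keep only files ending in ".count"; list.files() returns them sorted
sampleFiles <- list.files(directory, pattern = "\\.count$")

# Derive the condition from each file name: everything before the first "_"
condition <- factor(sub("_.*$", "", sampleFiles))

# First column: sample name, second column: file name, then the metadata
sampleTable <- data.frame(sampleName = sub("\\.count$", "", sampleFiles),
                          fileName   = sampleFiles,
                          condition  = condition)
print(sampleTable)
```

With real htseq-count output in directory, this sampleTable could then be passed to DESeqDataSetFromHTSeqCount() exactly as in the answer.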

Thank you so much for this.

Sorry for these stupid questions but I have one more issue in:

sampleFiles <- grep("Bacteria",list.files(directory),value=TRUE)

My .count files are named after the 2 bacteria, for example "cowan" and "isolate". I tried grep-ing for both at once, but it doesn't work. How can I solve this?

Thanks a ton,

Sudip

modified 22 months ago • written 22 months ago by sudu87

You can set it by hand, just like the variable condition:

sampleFiles <- c('cowan1','cowan2','cowan3','isolate1','isolate2','isolate3')

Remember that the entries of sampleFiles must be in the same order as the entries of condition.

written 22 months ago by ZZzzzzhong
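
An alternative to typing the names by hand, sketched here under assumptions: in a regular expression, "|" means "or", so a single grep() pattern can match both names. The file names below (cowan1.count and so on, plus an unrelated notes.txt) are hypothetical; in practice the vector would come from list.files(directory).

```r
# Hypothetical listing; in practice use list.files(directory)
allFiles <- c("cowan1.count", "cowan2.count", "cowan3.count",
              "isolate1.count", "isolate2.count", "isolate3.count",
              "notes.txt")

# "cowan|isolate" matches any name containing either word
sampleFiles <- grep("cowan|isolate", allFiles, value = TRUE)

# Condition: the file name with the replicate number and extension removed
condition <- sub("[0-9]+\\.count$", "", sampleFiles)

print(data.frame(fileName = sampleFiles, condition = condition))
```

Because condition is computed from sampleFiles, the two stay in the same order automatically.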

Hi ZZzzzzhong, I am trying to follow the method you suggested, but I am getting this error: "Error in data.frame(sampleName = sampleFiles, fileName = sampleFiles, : arguments imply differing number of rows: 0, 4".

I have checked the number of rows in each individual file, and they are all the same.

Here is my script:

directory <- "C/RNA SEQ adv cgrp/IWAT"
sampleFiles <- grep("COUNT FILES",list.files(directory),value=TRUE)
condition <- c('237 COUNT FILES','264 COUNT FILES','267 COUNT FILES','265 COUNT FILES')
sampleTable <- data.frame(sampleName = sampleFiles,
                      fileName = sampleFiles,
                      condition = condition)

written 3 months ago by makwana.kd
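
The "0, 4" in that message refers to the lengths of the vectors passed to data.frame(), not to the rows inside the count files: sampleFiles has 0 entries and condition has 4. That happens when list.files(directory) returns nothing, or when no file name contains the grep pattern. The sketch below reproduces the situation with a hypothetical non-existent directory and shows the checks that would reveal it. Which of the two causes applies to the script above cannot be told from the post.

```r
# Hypothetical directory that does not exist, to reproduce the empty result
directory <- file.path(tempdir(), "no_such_folder")

# Check 1: does R see the directory at all?
print(dir.exists(directory))

# Check 2: what does list.files() return, and how many names match?
print(list.files(directory))
sampleFiles <- grep("COUNT FILES", list.files(directory), value = TRUE)
print(length(sampleFiles))

# With an empty sampleFiles and four conditions, data.frame() fails
condition <- c("a", "b", "c", "d")
result <- tryCatch(
  data.frame(sampleName = sampleFiles,
             fileName   = sampleFiles,
             condition  = condition),
  error = function(e) conditionMessage(e)
)
print(result)
```

Running dir.exists() and list.files() on the real directory first shows whether the path or the pattern needs fixing.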
poojasethiya wrote:

You can use the following function to run DESeq2 on htseq-count output.

deseq_from_htseqcount.R

~ Pooja

modified 22 months ago • written 22 months ago by poojasethiya