Analyzing counts over 3 different timepoints and 2 conditions with DESeq2
11 weeks ago
cdeantoneo31 ▴ 10

I am trying to analyze data that has three timepoints (0, 6, and 12 hr) and two conditions (treated and control), but every attempt I make to do this with DESeq2 fails with an error.

My latest attempt is below; it follows this brief tutorial:

 counts<-read.csv("E:/Mac_data/macs_counts_11_3_21.csv", header=TRUE)
X Merged_KO_0hr.bam Merged_KO_12hr.bam Merged_KO_6hr.bam Merged_WT_0hr.bam
1 0610005C13Rik                 4                  3                 2                 6
2 0610006L08Rik                 0                  0                 0                 0
3 0610009B22Rik              1040                509               663              1082
4 0610009E02Rik                14                  9                10                 8
5 0610009L18Rik               151                 62                99               125
6 0610010F05Rik              2524                699              1604              2121
Merged_WT_12hr.bam Merged_WT_6hr.bam
1                  1                 2
2                  0                 0
3                540               644
4                 15                 8
5                 45                50
6                683              1318
counts_matrix <- data.matrix(counts)
X Merged_KO_0hr.bam Merged_KO_12hr.bam Merged_KO_6hr.bam Merged_WT_0hr.bam
[1,] 1                 4                  3                 2                 6
[2,] 2                 0                  0                 0                 0
[3,] 3              1040                509               663              1082
[4,] 4                14                  9                10                 8
[5,] 5               151                 62                99               125
[6,] 6              2524                699              1604              2121
Merged_WT_12hr.bam Merged_WT_6hr.bam
[1,]                  1                 2
[2,]                  0                 0
[3,]                540               644
[4,]                 15                 8
[5,]                 45                50
[6,]                683              1318
#set exp design and coldata
exp_design_file <- file.path("mac_exp_design_11_3.csv")
exp_design <- read.csv(exp_design_file, stringsAsFactors = FALSE)
sample Condition Timepoint
1  Merged_WT_0hr.bam   treated        0h
2  Merged_WT_6hr.bam   treated        6h
3 Merged_WT_12hr.bam   treated       12h
4  Merged_KO_0hr.bam   control        0h
5  Merged_KO_6hr.bam   control        6h
6 Merged_KO_12hr.bam   control       12h
DataFrame with 6 rows and 3 columns
sample   Condition   Timepoint
<character> <character> <character>
1  Merged_WT_0hr.bam     treated          0h
2  Merged_WT_6hr.bam     treated          6h
3 Merged_WT_12hr.bam     treated         12h
4  Merged_KO_0hr.bam     control          0h
5  Merged_KO_6hr.bam     control          6h
6 Merged_KO_12hr.bam     control         12h
#DESeq2
full_model <- ~ sample + Condition + Timepoint + Condition:Timepoint
reduced_model <- ~ sample + Condition + Timepoint
dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
+                               design = ~ sample + Condition +
+                                 Timepoint + Condition:Timepoint)
Error in DESeqDataSetFromMatrix(countData = counts, colData = coldata,  :
ncol(countData) == nrow(colData) is not TRUE


I would appreciate any help. I understand WHAT the error message is saying, but I don't know how to fix it.

Additionally, if there is a better way to handle this kind of data I would also appreciate feedback in that regard. I'm here to learn!

noob deseq deseq2 • 609 views
11 weeks ago

Look at your data. What is dim(counts)? And why?
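As a hint for that check, here is a minimal sketch using a toy CSV built from the first few rows shown in the question (the inline `csv_text` is an assumption standing in for the real file). It shows why `dim(counts)` reports one column more than there are samples: the gene IDs are read as an ordinary column named `X`, and `data.matrix()` then turns that column into integer codes instead of dropping it.

```r
# Toy CSV mimicking the layout in the question (values copied from its head() output).
csv_text <- "X,Merged_KO_0hr.bam,Merged_KO_12hr.bam,Merged_KO_6hr.bam,Merged_WT_0hr.bam,Merged_WT_12hr.bam,Merged_WT_6hr.bam
0610005C13Rik,4,3,2,6,1,2
0610006L08Rik,0,0,0,0,0,0
0610009B22Rik,1040,509,663,1082,540,644"

# Read as in the question: the gene IDs end up in a regular column called X.
counts_raw <- read.csv(text = csv_text, header = TRUE)
dim(counts_raw)  # 7 columns for 6 samples

# Read again, this time using the first column as row names.
counts <- read.csv(text = csv_text, header = TRUE, row.names = 1)
dim(counts)      # 6 columns, one per sample

# A plain numeric matrix with gene IDs as row names and samples as column names.
counts_matrix <- as.matrix(counts)
```

With the real file, the same `row.names = 1` argument in the original `read.csv()` call should have the same effect, provided the first column holds unique gene IDs.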

I strongly recommend that you put down your own data and go through some sample data from the vignette or a tutorial before you try to analyze your own data. After you've done that, you can do your data side by side, so you can catch when your data isn't looking like it should.


Fair enough. Do you think this is a good place to start?

http://master.bioconductor.org/packages/release/workflows/html/rnaseqGene.html


Sure. That one can be a bit overwhelming in the beginning; just note that it goes over a lot of different ways to import data, and you'll only need one such method at a time.


I figured it out :) Thank you for reminding me to slow down and start with the basics instead of jumping right in with my own data.

11 weeks ago

Hi,

In your case, the error occurs because colnames(counts) does not match rownames(coldata), in number or in names. I highly recommend that you inspect the number and order of the row names in your coldata object and verify that they are the same as the column names of your counts. If they are not, use a plain-text editor (if you are not familiar with R) to match them, and make sure that you have the same number of samples in both objects.
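The check can also be done in R itself. The following is a rough sketch, not the poster's actual objects: `counts_matrix` and `coldata` are small stand-ins rebuilt from the sample names and a few values shown in the question. It verifies that both objects describe the same samples and then reorders the count columns to follow the coldata rows.

```r
# Stand-in count matrix: 2 genes x 6 samples, in the column order of the counts file.
sample_ids <- c("Merged_KO_0hr.bam", "Merged_KO_12hr.bam", "Merged_KO_6hr.bam",
                "Merged_WT_0hr.bam", "Merged_WT_12hr.bam", "Merged_WT_6hr.bam")
counts_matrix <- matrix(c(4, 3, 2, 6, 1, 2,
                          1040, 509, 663, 1082, 540, 644),
                        nrow = 2, byrow = TRUE,
                        dimnames = list(c("0610005C13Rik", "0610009B22Rik"),
                                        sample_ids))

# Stand-in sample table, in the row order of the experimental design file.
coldata <- data.frame(
  sample = c("Merged_WT_0hr.bam", "Merged_WT_6hr.bam", "Merged_WT_12hr.bam",
             "Merged_KO_0hr.bam", "Merged_KO_6hr.bam", "Merged_KO_12hr.bam"),
  Condition = rep(c("treated", "control"), each = 3),
  Timepoint = rep(c("0h", "6h", "12h"), times = 2),
  stringsAsFactors = FALSE
)
rownames(coldata) <- coldata$sample

# Both objects must describe exactly the same set of samples.
stopifnot(setequal(colnames(counts_matrix), rownames(coldata)))

# Reorder the count columns so that they follow the coldata row order.
counts_matrix <- counts_matrix[, rownames(coldata), drop = FALSE]
stopifnot(identical(colnames(counts_matrix), rownames(coldata)))
```

Once both `stopifnot()` calls pass, `ncol(counts_matrix) == nrow(coldata)` holds, which is the condition the error message in the question complains about.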

Best regards,

11 weeks ago
Bane ▴ 10

Well, I had the same problem a few months ago and found this tutorial. It solved everything:

Factor Designing
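
On the design question itself, a small base-R sketch may help. It only builds model matrices with `model.matrix()` from a sample table rebuilt from the question (an assumption, not the poster's real objects) and does not run DESeq2. It illustrates two points: putting `sample` in the formula gives a design that is not full rank, and with one sample per Condition/Timepoint combination the interaction model has as many coefficients as samples, leaving nothing to estimate dispersion from.

```r
# Sample table rebuilt from the question: one sample per Condition x Timepoint cell.
coldata <- data.frame(
  sample = factor(c("WT_0hr", "WT_6hr", "WT_12hr", "KO_0hr", "KO_6hr", "KO_12hr")),
  Condition = factor(rep(c("treated", "control"), each = 3),
                     levels = c("control", "treated")),
  Timepoint = factor(rep(c("0h", "6h", "12h"), times = 2),
                     levels = c("0h", "6h", "12h"))
)

# Design with one term per sample, as in the question's formula (interaction left out).
with_sample <- model.matrix(~ sample + Condition + Timepoint, data = coldata)
ncol(with_sample)      # number of coefficients
qr(with_sample)$rank   # rank is lower than that, so the design is not full rank

# Full and reduced designs without the sample term.
full <- model.matrix(~ Condition + Timepoint + Condition:Timepoint, data = coldata)
reduced <- model.matrix(~ Condition + Timepoint, data = coldata)
dim(full)      # samples x coefficients
dim(reduced)

# Residual degrees of freedom left for each design.
nrow(full) - qr(full)$rank
nrow(reduced) - qr(reduced)$rank
```

With biological replicates for each combination, the usual way to test the interaction would be a likelihood ratio test, for example `DESeq(dds, test = "LRT", reduced = ~ Condition + Timepoint)`, where `dds` is built with the full design.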