Question: Input for DESeq2 differential expression and multiple comparisons between samples
Asked 19 months ago by Safa.A (United States):

I have RNA-seq data for 24 samples: 8 conditions with 3 biological replicates each. The 8 conditions come from two different plants (cultivars C and D), each sampled at the same four ages (7, 10, 14, and 21). Phenotypically, each plant is either susceptible (S) or resistant (R) depending on its age: the first plant has two susceptible ages and two resistant ages, while the other plant has three susceptible ages and one resistant age.

I am using the HTSeq-DESeq2 pipeline for differential expression. My goals are as follows. First, for plant 1, compare the susceptible ages with each other (S vs. S) and the resistant ages with each other (R vs. R), then compare the resulting S gene list with the R gene list. I want to do the same for plant 2, except that it has three S ages and only one R age, so I need to compare all three S ages with the R age.

Second, compare ages 14 and 21 between the two plants (14 vs. 14 and 21 vs. 21).

Third, determine whether the resistance phenotype is plant dependent, that is, whether each plant has its own unique R genes or both plants share common genes for the R phenotype.
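The third goal boils down to set operations on two gene lists. A minimal sketch in base R, assuming each plant's R-vs-S comparison has already been reduced to a vector of significant gene IDs; the IDs and the variable names below are hypothetical placeholders, not results from this data set:

```r
# Hypothetical significant gene IDs from an R-vs-S comparison in each plant
r_genes_C <- c("gene1", "gene2", "gene3", "gene5")
r_genes_D <- c("gene2", "gene3", "gene4")

# Genes associated with resistance in both plants
common_R <- intersect(r_genes_C, r_genes_D)

# Genes associated with resistance in only one plant
unique_C <- setdiff(r_genes_C, r_genes_D)
unique_D <- setdiff(r_genes_D, r_genes_C)

print(list(common = common_R, only_C = unique_C, only_D = unique_D))
```

Overlapping lists like this depend on the significance cutoff applied to each list, so a gene that just misses the threshold in one plant will look plant-specific. It is probably best treated as a first look rather than a formal test.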

Here is my R code for DESeq2:

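sampleFiles <- list.files(path="/to/htseq-output")
directory <- c("/main directory/")
sampleCondition <- read.table("path/to/phenodata.txt", header=TRUE)
# list.files() returns names in alphabetical order (C10, C14, C21, C7, ...),
# so reorder the phenotype rows to match the files before combining them
sampleCondition <- sampleCondition[match(sub("_", "-", sub("\\.txt$", "", sampleFiles)), sampleCondition$sampleID), ]
sampleTable <- data.frame(sampleName = sampleFiles, fileName = sampleFiles, condition = sampleCondition)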

My sampleCondition file is:

sampleID    cultivar    phenotype   age
C7-1    C   S   7
C7-2    C   S   7
C7-3    C   S   7
C10-1   C   S   10
C10-2   C   S   10
C10-3   C   S   10
C14-1   C   R   14
C14-2   C   R   14
C14-3   C   R   14
C21-1   C   R   21
C21-2   C   R   21
C21-3   C   R   21
D7-1    D   S   7
D7-2    D   S   7
D7-3    D   S   7
D10-1   D   S   10
D10-2   D   S   10
D10-3   D   S   10
D14-1   D   S   14
D14-2   D   S   14
D14-3   D   S   14
D21-1   D   R   21
D21-2   D   R   21
D21-3   D   R   21

The resulting sampleTable:

sampleName  fileName    condition.sampleID  condition.cultivar  condition.phenotype condition.age

1   C10_1.txt   C10_1.txt   C10-1   C   S   10
2   C10_2.txt   C10_2.txt   C10-2   C   S   10
3   C10_3.txt   C10_3.txt   C10-3   C   S   10
4   C14_1.txt   C14_1.txt   C14-1   C   R   14
5   C14_2.txt   C14_2.txt   C14-2   C   R   14
6   C14_3.txt   C14_3.txt   C14-3   C   R   14
7   C21_1.txt   C21_1.txt   C21-1   C   R   21
8   C21_2.txt   C21_2.txt   C21-2   C   R   21
9   C21_3.txt   C21_3.txt   C21-3   C   R   21
10  C7_1.txt    C7_1.txt    C7-1    C   S   7
11  C7_2.txt    C7_2.txt    C7-2    C   S   7
12  C7_3.txt    C7_3.txt    C7-3    C   S   7
13  D10_1.txt   D10_1.txt   D10-1   D   S   10
14  D10_2.txt   D10_2.txt   D10-2   D   S   10
15  D10_3.txt   D10_3.txt   D10-3   D   S   10
16  D14_1.txt   D14_1.txt   D14-1   D   S   14
17  D14_2.txt   D14_2.txt   D14-2   D   S   14
18  D14_3.txt   D14_3.txt   D14-3   D   S   14
19  D21_1.txt   D21_1.txt   D21-1   D   R   21
20  D21_2.txt   D21_2.txt   D21-2   D   R   21
21  D21_3.txt   D21_3.txt   D21-3   D   R   21
22  D7_1.txt    D7_1.txt    D7-1    D   S   7
23  D7_2.txt    D7_2.txt    D7-2    D   S   7
24  D7_3.txt    D7_3.txt    D7-3    D   S   7

DESeq2 differential expression:

dds <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ )

I am not sure what I should put in the design to achieve my goals. I have tried:

dds <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ condition.cultivar + condition.phenotype + condition.cultivar:condition.phenotype)  # but I am not convinced

Then:

dds <- dds[ rowSums(counts(dds)) > 1, ]
gene_de_comparisons <- DESeq(dds)
resultsNames(gene_de_comparisons)

The result is:

[1] "Intercept"
[2] "condition.cultivar_D_vs_C"
[3] "condition.phenotype_S_vs_R"
[4] "condition.cultivarD.condition.phenotypeS"

This result is not what I am after. I don't understand why I get these particular comparisons, and I am not sure what the correct code is to achieve my goals.
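One way to see where those four names come from is to build the same design with base R's `model.matrix()`, which uses the same formula. The sketch below uses a stand-in data frame (`design_df`, my own name) with the cultivar and phenotype columns from the table above. By default R takes the alphabetically first level of each factor as the reference (C and R here), so the coefficients are D relative to C, S relative to R, and their interaction. Age is not in the formula, so no age comparison can appear.

```r
# Stand-in for the cultivar and phenotype columns of the sample table
design_df <- data.frame(
  cultivar  = factor(rep(c("C", "D"), each = 12)),
  phenotype = factor(c(rep(c("S", "S", "R", "R"), each = 3),   # cultivar C
                       rep(c("S", "S", "S", "R"), each = 3)))  # cultivar D
)

# Same formula as the design that was tried, without the "condition." prefix
mm <- model.matrix(~ cultivar + phenotype + cultivar:phenotype, data = design_df)

# One column per coefficient; compare with the resultsNames() output
print(colnames(mm))
```

The four columns correspond one to one with the four entries reported by `resultsNames()`. If S should be the baseline instead, `relevel(design_df$phenotype, ref = "S")` would change the reference level.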

Any help is appreciated, and sorry for the long question; I just wanted to provide all the details.

rna-seq • 974 views
Hint: you need to include age in your design to answer your questions (the actual age, without the "C" or "D", not the combined condition label).

Comment written 19 months ago by Carlo Yague
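One possible way to act on this hint, offered as a sketch rather than the definitive design, is to combine cultivar and age into a single grouping factor so that each of the 8 conditions becomes its own level. The runnable part below uses only base R to show that such a design has 8 levels and a full-rank model matrix. The names `design_df` and `group` are my own, and the commented DESeq2 lines assume the `dds` object and column names from the question.

```r
# Stand-in for the cultivar and age columns of the sample table
design_df <- data.frame(
  cultivar = rep(c("C", "D"), each = 12),
  age      = rep(rep(c(7, 10, 14, 21), each = 3), times = 2)
)

# One level per cultivar-age combination, e.g. "C7", "D14"
design_df$group <- factor(paste0(design_df$cultivar, design_df$age))
print(levels(design_df$group))

# A one-factor design with 8 groups of 3 replicates is full rank
mm <- model.matrix(~ group, data = design_df)
print(qr(mm)$rank)

# With DESeq2 (not run here), the same idea would look roughly like:
#   dds$group <- factor(paste0(dds$condition.cultivar, dds$condition.age))
#   design(dds) <- ~ group
#   dds <- DESeq(dds)
#   results(dds, contrast = c("group", "D14", "C14"))  # age 14: D vs C
#   results(dds, contrast = c("group", "C14", "C10"))  # plant C: R age vs S age
```

With a grouping factor, every pairwise comparison listed in the question can be requested as its own contrast. The trade-off is that differences between cultivars in the resistance response are then compared across separate contrasts rather than tested directly through an interaction term.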