DESeq2 multiple variables!
4.3 years ago

Hi there, I have some doubts about my analysis. I have several variables in my RNA-seq analysis and I want to get the DEGs.

sample         experiment  typeCell  batch
SD1-1_S6       C           SN1       1
SD2-2_S13      C           SN1       1
SD3-1_S4       C           SN1       1
SD4-1_S17      C           SN1       1
S-SC1-1_S8     C           MN        1
S-SC2-1_S16    C           MN        1
S-SC3_S11      C           MN        1
S-SC4-1_S14    C           MN        1
TD1-2_S18      L           SN1       1
TD2-1_S9       L           SN1       1
TD3-1_S2       L           SN1       1
TD4-1_S1       L           SN1       1
TD6-1_S3       L           SN1       1
T-SC2-1_S5     L           MN        1
T-SC3-1_S12    L           MN        1
T-SC4-1_S10    L           MN        1
T-SC5_S15      L           MN        1
T-SC6_S7       L           MN        1
SCI-C-2-S3     L           PN        2
SCI-C-4-S6     L           PN        2
SCI-C-5-S7     L           PN        2
SCI-C-6-S17    L           PN        2
SCI-C-7-S11    L           PN        2
SCI-DL-1-S8    L           SN2       2
SCI-DL-2-S12   L           SN2       2
SCI-DL-4-S10   L           SN2       2
SCI-DL-6-S4    L           SN2       2
SCI-DR-5-S18   L           SN2       2
SHA-C-4-1-S13  C           PN        2
SHA-C-6-S15    C           PN        2
SHA-C-7-S16    C           PN        2
SHA-C-8-S5     C           PN        2
SHA-DL-1-S14   C           SN2       2
SHA-DL-4-S2    C           SN2       2
SHA-DR-5-S1    C           SN2       2
SHA-DR-8-S9    C           SN2       2

So, to compare only the controls (C) between the MN and SN1 cell types, I did:

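library(DESeq2)
# Keep only the control (C) samples from the SN1 and MN cell types
data2 <- data[data$experiment == "C" & data$typeCell %in% c("SN1", "MN"), ]
data2$batch <- NULL
# Select the matching count columns; the pattern has to match the column names of the count matrix
table2 <- table[, grep(pattern = ".+SHAM\\_MN.+|PNI\\-SHAM\\_SN.+", x = colnames(table))]
dds3 <- DESeqDataSetFromMatrix(countData = table2, colData = data2, design = ~ typeCell)
# index (defined elsewhere) marks the control genes used to estimate the size factors
dds3 <- estimateSizeFactors(dds3, controlGenes = index)
dds3 <- DESeq(dds3)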

Is it correct to filter for the conditions before the normalization, or do I need to filter after the normalization?

Note: I took this approach because when I filtered after the normalization I got the error "Error in checkFullRank(modelMatrix) : ...". I tried to check for redundant columns, but I still get the error!

Thanks in advance for your time.

Best Regards, Andreia

deseq2 R • 2.0k views

You probably got the error because batch is redundant with cell type: batch 1 contains only SN1 and MN, and batch 2 contains only PN and SN2, so the two effects cannot be separated (bad experiment design). Try removing the batch and run it again. You can do the normalization either way, but if you think that gene expression is somewhat similar in all cell types you should normalize using all the samples, as it will give a better estimate of expression variance.
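To see why the model matrix is not full rank, here is a minimal base R sketch. It assumes a toy sample sheet (two samples per cell type, with the same batch layout as in the question) and a hypothetical design of ~ batch + typeCell; the exact design that triggered the error is not shown in the thread.

```r
# Toy sample sheet mirroring the question: SN1/MN only in batch 1, PN/SN2 only in batch 2
coldata <- data.frame(
  typeCell = factor(rep(c("SN1", "MN", "PN", "SN2"), each = 2)),
  batch    = factor(rep(c(1, 1, 2, 2), each = 2))
)

# Design with both batch and cell type: the batch2 column equals typeCellPN + typeCellSN2
mm_both <- model.matrix(~ batch + typeCell, data = coldata)
rank_both <- qr(mm_both)$rank
cat("columns:", ncol(mm_both), "rank:", rank_both, "\n")

# Design with cell type only: every column carries independent information
mm_cell <- model.matrix(~ typeCell, data = coldata)
rank_cell <- qr(mm_cell)$rank
cat("columns:", ncol(mm_cell), "rank:", rank_cell, "\n")
```

When the rank is lower than the number of columns, at least one coefficient cannot be estimated, which is the situation checkFullRank complains about.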


I removed the column and I got the same error. :(


Can you add what you tried and how it failed?

4.3 years ago

If you want to compare two subsets of samples to each other...don't do it like this. I think some of these chopping steps are wrong, and that's why you have an error.

Make a new column that has experiment concatenated with celltype.
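As a rough illustration, the combined column could be built like this in base R. The column name group and the four-row sample sheet are made up for the example; in practice you would add the column to your full sample table.

```r
# Small stand-in for the sample sheet (hypothetical rows)
coldata <- data.frame(
  experiment = c("C", "C", "L", "L"),
  typeCell   = c("SN1", "MN", "SN1", "MN")
)

# One factor level per experiment/cell-type combination, e.g. "C_SN1"
coldata$group <- factor(paste(coldata$experiment, coldata$typeCell, sep = "_"))

print(levels(coldata$group))
```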

Make dds with all the data. If you really don't want all the samples normalized together (most of the time you do want all the samples normalized together, even the ones you aren't directly comparing), don't do it by chopping up your input files. Make new dds objects, like

dds_keep <- dds[,colnames(dds) %in% keep]

or

dds_mytissue <- dds[, dds$Tissue %in% c('mytissue')]

You might need some droplevels() calls to clean up unused factor levels in the design variables.
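Here is a small base R sketch of why leftover levels matter, using a made-up factor rather than a real dds object. A level with no samples still gets its own all-zero column in the model matrix, and that alone makes the matrix rank deficient.

```r
# Made-up cell-type factor, then keep only two of the three types
typeCell <- factor(c("MN", "MN", "SN1", "SN1", "PN", "PN"))
kept <- typeCell[typeCell %in% c("MN", "SN1")]

# "PN" is still a level, so it gets an all-zero column
mm_before <- model.matrix(~ kept)
cat("before: columns", ncol(mm_before), "rank", qr(mm_before)$rank, "\n")

# After droplevels() only the levels that are actually present remain
kept_clean <- droplevels(kept)
mm_after <- model.matrix(~ kept_clean)
cat("after:  columns", ncol(mm_after), "rank", qr(mm_after)$rank, "\n")
```

On a subset DESeqDataSet the same idea would look like dds_keep$group <- droplevels(dds_keep$group), where group is whatever column your design uses.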

I also strongly recommend you not just run the DESeq command like that. Specify the contrasts you want. The idea is for it to be as easy as possible for you to figure out what you did 6 months from now. To compare two subsets to each other, use the concatenated column as the design, and specify the comparison you want with the contrast argument when you extract the results with results().
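Putting the steps together, a sketch of the whole workflow might look like the following. It runs on simulated counts from makeExampleDESeqDataSet, and the sample annotation and the group column name are assumptions made for the example, not your real data.

```r
library(DESeq2)

set.seed(1)
# Simulated counts standing in for the real matrix: 1000 genes x 16 samples
dds <- makeExampleDESeqDataSet(n = 1000, m = 16)

# Hypothetical annotation: 4 samples per experiment/cell-type combination
dds$experiment <- factor(rep(c("C", "L"), each = 8))
dds$typeCell   <- factor(rep(rep(c("SN1", "MN"), each = 4), times = 2))
dds$group      <- factor(paste(dds$experiment, dds$typeCell, sep = "_"))

# Fit once on all samples, with the combined factor as the design
design(dds) <- ~ group
dds <- DESeq(dds)

# Ask explicitly for control MN versus control SN1 (C_SN1 is the denominator)
res <- results(dds, contrast = c("group", "C_MN", "C_SN1"))

# Alternative: subset the object, drop the unused levels, and refit.
# The size factors estimated on all samples are already stored and get reused.
dds_keep <- dds[, dds$group %in% c("C_SN1", "C_MN")]
dds_keep$group <- droplevels(dds_keep$group)
dds_keep <- DESeq(dds_keep)
res_keep <- results(dds_keep, contrast = c("group", "C_MN", "C_SN1"))
```

The first route uses every sample to estimate dispersions, while the second only uses the two groups being compared. Either way, the contrast records exactly which comparison was made.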
