Question: check deseq2 design please
0
gravatar for Davide
11 days ago by
Davide10
Davide10 wrote:

I have a metadata file as below and a matrix count file. I would like to run a differential expression using DeSEQ2 to find differentially expressed genes taking into account Age and Gender. (A=Young,B=Middle Age, C=Old)

My Question is can someone check my code and correct where necessary exactly what needs to be changed.

Thank you very much in advance for your time and help.


library("DESeq2")

b<-read.table("ARGS_GA", as.is=TRUE,header=TRUE, sep="\t") a<-read.table("matrix_countA", header=TRUE, row.names=1,sep="\t") dds<-DESeqDataSetFromMatrix(countData =a,b,design=~ condition + Gender + Age) dds <- dds[ rowSums(counts(dds)) >= 2, ] dds<-DESeq(dds) res<-results(dds, contrast=list("condition_CONTROL_vs_CASE"))

saveRDS(res, file="case_control.RData")

resOrdered <- res[order(res$padj),] saveRDS(resOrdered, file="case_control_ordered.RData")

sig <- resOrdered[!is.na(resOrdered$padj) & resOrdered$padj<0.10 & abs(resOrdered$log2FoldChange)>=1,]

saveRDS(sig, file="case_control.30sig2.RData")

rna-seq deseq2 • 206 views
ADD COMMENTlink modified 5 days ago • written 11 days ago by Davide10
2

How To Ask Good Questions On Technical And Scientific Forums

ADD REPLYlink written 11 days ago by Asaf6.5k

Sorry I am new to posting.

Thank you

ADD REPLYlink written 11 days ago by Davide10
1

Biostars is not a code review forum. If you have a specific question or error, we can probably help you out, but throwing a big hunk of code at people with little context is unlikely to get you the help you require.

You should typically put the variable you're interested in last in your design, so if you're interested in differences due to condition while accounting for differences from Age and Gender, your design would be ~Age + Gender + condition.

Also, sig and sig2 are the same thing here, so not quite sure the purpose of that bit.

ADD REPLYlink written 11 days ago by jared.andrews073.9k
1

Thank you,

DESeqDataSetFromMatrix(countData =a,b,design=~ Age + Gender + condition)

I would like someone to scan the code to see is there anything wrong with my approach/code ?

Thank you

ADD REPLYlink modified 5 days ago by RamRS24k • written 11 days ago by Davide10
  • Why are you using a cut-off of adjusted p<0.1?
  • You do not define sig2 anywhere in your code
  • Why do you feel the need to adjust for age and gender? - what evidence have you that they are confounding factors?
ADD REPLYlink modified 5 days ago • written 5 days ago by Kevin Blighe51k

Thanks Kevin I made the correction to my code.

When I run PCA on the normalized counts then Gender appeared to separate the data of controls + cases.

Just to clarify, would the following give results that already takes care of the gender and age based on the design

DESeqDataSetFromMatrix(countData =a,b,design=~ Age + Gender + condition)

or do I need to run something more to get differential genes that has the effects of gender and age removed?

res<-results(dds, contrast=list("condition_CONTROL_vs_CASE"))

Thank you very much for your time and help

ADD REPLYlink modified 5 days ago by RamRS24k • written 5 days ago by Davide10

Yes, as per Asaf's comment

ADD REPLYlink written 5 days ago by Kevin Blighe51k

You should typically put the variable you're interested in last in your design

Out of curiosity, why ? I always thought that the order of variables does not matter.

ADD REPLYlink written 5 days ago by Carlo Yague4.8k
1

Clarity, and the results() command will use contrasts for the last variable by default. You're right that it makes no functional difference though, you can get any comparison you want from those variables via proper results() calls.

ADD REPLYlink written 5 days ago by jared.andrews073.9k

Sorry but can someone check my design? I am still unclear on exactly how to identify the differentially expressed genes taking into account Age and Gender

Thank you again in advance for your time and help

ADD REPLYlink written 5 days ago by Davide10

Your design is reasonable. It would be better to post only the design, you can't expect users to comment on such a long segment of code.

ADD REPLYlink written 5 days ago by Asaf6.5k

Thanks Asaf

Just to clarify, so the following design

DESeqDataSetFromMatrix(countData =a,b,design=~ Age + Gender + condition)

would give me the differential genes that has the effects of gender and age removed using this code:

res<-results(dds, contrast=list("condition_CONTROL_vs_CASE"))

Or do I need to run something else to get differential genes that has been corrected for Age and Gender?

Thank you very much for your time and help

ADD REPLYlink modified 5 days ago by RamRS24k • written 5 days ago by Davide10

Yes, this is the way to get the experiment effect, Age and Gender effects removed.

ADD REPLYlink written 5 days ago by Asaf6.5k

Hi,

There is no need to use <pre><code> for code statements. Yoou can use the formatting bar instead: there are backticks for inline code (`text` becomes text), and there is the highlighted button to format a chunk of text as a code block.

code_formatting

ADD REPLYlink written 5 days ago by RamRS24k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1484 users visited in the last hour