Question: How to analyze the RNAseq paired samples with DESeq2?
0
gravatar for Kenneth
10 months ago by
Kenneth0
Kenneth0 wrote:

I am a graduate student who works with the RNAseq data for the first time. I have been stuck on DESeq2 for quite a while so it would be great to have some feedback on my analysis! My question is: how to find the condition effect while controlling the participant effect?

This is what we have done:

  • Our study delivers 3 different conditions and 1 control treatment for each participant (13 total).
  • We sequenced the RNA from the samples
  • FastQC indicates a very good quality of the fastq data
  • I then used Salmon to align and assembly against the Ensembl transcriptome 98
  • I read the DESeq2 vignette and this tutorial, and I am able to reproduce the results
  • Now I attempted to run DESeq2 on my data to find the differential expressed genes
  • I used dds <- DESeqDataSetFromTximport(txi, colData = coldata, design = ~participant+condition) to create the dds object. Then I simply ran dds <- DESeq(dds) and res <- results(dds)
  • sum(res$padj < 0.1, na.rm=TRUE) return 4
  • sum(res$pvalue < 0.1, na.rm=TRUE) return 1448
  • I tried to use 'contrast' for other conditions comparison but they are very similar to this example

This is the PCA plot of the paired data:

PCA plot of the paired data

This is the results output and p-value histogram: enter image description here enter image description here

From plotDispEsts(dds)

enter image description here

Summary of the results

enter image description here

The four genes from above enter image description here

rna-seq R • 930 views
ADD COMMENTlink modified 9 months ago • written 10 months ago by Kenneth0
1

Did you have one replicate for each condition? Do all the participants have the same sex? If two or three conditions are similar I would combine them and test against the control to increase power.

ADD REPLYlink written 10 months ago by Asaf8.5k

I have 13 different participants for each condition and they are 50/50 male and female

ADD REPLYlink written 10 months ago by Kenneth0
1

So what is your question exactly? "I don't know what to do now" is very vague; is there a specific problem you're stuck with?

ADD REPLYlink written 10 months ago by khorms210

I want to learn how to find the differentially expressed genes in this case, but I don't know how to proceed now!

ADD REPLYlink written 10 months ago by Kenneth0
1

Time for a different approach: If I were you, I would start to generate some sample to sample plots to show that there is not much 'distance' between the samples, hence the poor results. You can't just say it didn't work, you have to show why. They would complement the PCA plot.

ADD REPLYlink written 10 months ago by Barry Digby580

I tried the hcluster package to get some plots, and it showed that the samples cluster within each participant as expected. But they don't cluster based on the condition. Is there a way to generate a cluster plot base on condition while accounting the differences between participants? Thanks!

ADD REPLYlink written 10 months ago by Kenneth0

No. Clustering is unsupervised...

ADD REPLYlink written 10 months ago by Barry Digby580
1

What is your actual question? You've not provided us with any summary of your results: how many genes were differentially expressed for a typical treatment-vs-control comparison?

ADD REPLYlink written 10 months ago by russhh5.5k

Sorry I didn't make it clear. I updated the post with the results summary. My question is how can I find the differential expressed genes based on what I have? Thanks for your time!

ADD REPLYlink written 10 months ago by Kenneth0

You can use CSI NGS Portal for a full Differential Expression analysis using DESeq2.

ADD REPLYlink written 10 months ago by Ă–mer An220

If you aim to promote your tool, please make a Tool post. OPs question is not about this. I will therefore move it to comment. Please do not advertise your tool if it doesn't add to the question.

ADD REPLYlink written 10 months ago by ATpoint42k
1
gravatar for i.sudbery
10 months ago by
i.sudbery9.8k
Sheffield, UK
i.sudbery9.8k wrote:

It would be easy to conclude from this analysis that there is no effect of your condition. However, I am slightly drawn by the fact your p-value histogram skews towards one. In the case that there was genuinely no effect of the treatment - that is all the null hypotheses were true, we'd expect to see a flat p-value histogram.

It also looks like you've got a coherent group of genes that have a very high variance compared to the others. which I guess could be doing something to the variance fit, and making the model conservative? What are your mapping rates? You can find this by looking in the log files that come out of salmon?

Also, what do your library sizes look like?

ADD COMMENTlink written 10 months ago by i.sudbery9.8k

Thanks for your kind help! Some other p-value histogram shows a left-skewed like this

enter image description here

My mapping rates are about 85~92% and the average library size is 49 million

ADD REPLYlink written 10 months ago by Kenneth0
0
gravatar for swbarnes2
10 months ago by
swbarnes29.2k
United States
swbarnes29.2k wrote:

At first glance at your PCA, I'd say your condition doesn't do anything.

ADD COMMENTlink written 10 months ago by swbarnes29.2k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1150 users visited in the last hour