Question: DESeq2 DEG analysis using different sequencing depth data
woongjaej wrote:

Hi, guys.

I have a question about analysing RNA-seq data (I'm using DESeq2).

I would like to use 8 samples with high sequencing depth and 6 samples with low sequencing depth (about 4 times lower).

Can I analyse these data together in DESeq2? Does DESeq2 normalize the counts of these samples for sequencing depth?

If I can't, could somebody point me to a method that lets me use all of these samples?

Any help would be a lifesaver. Thanks.

Best, Woongjae

modified 4 weeks ago by h.mon • written 5 weeks ago by woongjaej

Things to consider:

  • How low are the low depth samples? Do they reach at least 10 million reads? Are they less than 1-2 million reads? How high are the high depth samples?

  • Are all the samples from the same library preparation / sequencing batch, or from different batches? Why do some samples have high depth and others low depth? Bad RNA quality? Ribosomal RNA contamination?

  • Are high and low depth samples randomly distributed across treatments, or are all high depth samples from one treatment and all low depth samples from another?

written 5 weeks ago by h.mon

Hi h.mon!

  1. The high and low depth samples have about 100 million and 10 million reads each, respectively.
  2. Library preparation was performed twice (experiment 1 and experiment 2). The designs are identical; only the samples differ (e.g. tissue from different mice, but the same condition, library kit, age, and gender).
  3. Samples of each depth are distributed equally between groups: 3 controls and 3 treatments at low depth, and 4 controls and 4 treatments at high depth. The experiment was performed twice because we thought we needed more data, and for the second experiment we decided to sequence more deeply.

Thanks for your help h.mon!!

written 4 weeks ago by woongjaej

I use edgeR rather than DESeq2, but I know they are pretty similar. edgeR will normalize for sequencing depth (using the TMM method by default). I'm sure DESeq2 also has a similar step in its standard workflow. Does DESeq2 use raw read counts, or something like RPKM? edgeR takes raw counts, which is why it performs TMM normalization. If you use RPKM, the values are already normalized for sequencing depth, in addition to gene length.
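
As a rough illustration of the depth and length scaling that RPKM applies, here is a minimal Python sketch. The function name `rpkm`, the toy counts, and the gene lengths are made up for illustration, and the total number of mapped reads is approximated as the sum of the counts in the table, which is an assumption.

```python
def rpkm(counts, gene_lengths_bp):
    """Reads per kilobase of gene per million mapped reads.

    counts: raw read counts for one sample, one entry per gene.
    gene_lengths_bp: gene lengths in base pairs, in the same order.
    Assumption: total mapped reads = sum of the counts given here.
    """
    total_reads = sum(counts)
    return [
        c * 1e9 / (length * total_reads)
        for c, length in zip(counts, gene_lengths_bp)
    ]


# Toy data (made up): the "high" sample is the "low" sample sequenced 10x deeper.
gene_lengths_bp = [1000, 2000, 3000]
low_depth = [100, 300, 600]
high_depth = [1000, 3000, 6000]

rpkm_low = rpkm(low_depth, gene_lengths_bp)
rpkm_high = rpkm(high_depth, gene_lengths_bp)
```

Both depths give the same RPKM values here only because the toy high depth sample is an exact 10x multiple of the low depth one; real samples will not match this cleanly.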

modified 5 weeks ago • written 5 weeks ago by goodez

Thanks goodez!! DESeq2 uses raw counts at the beginning of the process. DESeq2 may also use a similar method to normalize the raw counts, but I'm not sure, and my two sequencing depth groups seem to have different FPKM values and very different normalized counts. So I'm hoping to get this confirmed by some DESeq2 experts. I'll try edgeR, too. Thank you very much for the reply!!

written 5 weeks ago by woongjaej

I would try a test run where you downsample the big ones down to the coverage of the low ones. See if that looks drastically different than using all the data together.
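
One way such a test run could be sketched, assuming the downsampling is done on the per-gene count table rather than on the FASTQ or BAM files (the thread does not say which): keep each read independently with a fixed probability. The function name `downsample_counts` and the toy counts below are hypothetical.

```python
import random


def downsample_counts(counts, target_total, seed=0):
    """Thin a vector of per-gene counts to roughly target_total reads.

    Each read is kept independently with probability
    target_total / sum(counts) (binomial thinning), so the resulting
    total is close to, but not exactly, target_total.
    """
    total = sum(counts)
    if target_total >= total:
        # Already at or below the target depth: return the counts unchanged.
        return list(counts)
    keep_prob = target_total / total
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        sum(1 for _ in range(c) if rng.random() < keep_prob)
        for c in counts
    ]


# Toy high depth sample (made-up numbers), thinned to a quarter of its depth.
deep = [4000, 800, 0, 12000, 40]
shallow = downsample_counts(deep, target_total=sum(deep) // 4)
```

The thinned high depth samples can then be analysed together with the low depth ones, and the result compared with the analysis on the full data. The per-read loop is slow for real libraries; it is written this way only to keep the sketch dependency-free.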

written 5 weeks ago by swbarnes2

Hi swbarnes2

I performed differential expression analysis in two groups (1. low depth samples, 3 vs 3; 2. high depth samples, 4 vs 4).

There seems to be no drastic difference, but the FDR of some genes improved. For example, if I use only the high depth samples for the DEG analysis, some genes' FDR values are above 0.05, but when I use all the samples together, the FDR drops below 0.05. Fold changes do not seem to change much.

written 4 weeks ago by woongjaej

I'm not sure why you responded to me to say you did not do what I suggested...

written 4 weeks ago by swbarnes2

Sorry, I misunderstood your comment. I'll try to run your suggestion. Thanks.

written 4 weeks ago by woongjaej

h.mon (Brazil) wrote:

The DESeq2 count normalization thread on the Bioconductor forum has a lot of useful information for you.

As your low and high depth samples (and library prep batches) are balanced between the treatments, I think you can use them together and let DESeq2 size factor normalization take care of the depth difference. However, you have a batch effect in your experiment. Examine the PCA plot: depending on how your samples group, you may want to include batch in the design formula so that it is taken into account when testing for treatment effects.
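
To give a feel for what size factor normalization does, here is a minimal Python sketch of the median-of-ratios idea. It is an illustration under simplifying assumptions, not DESeq2's actual code: the function name `size_factors` and the toy count table are made up, and genes with a zero count in any sample are simply skipped.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples table of raw counts.

    counts: list of rows, one row per gene, one column per sample.
    """
    n_samples = len(counts[0])
    # Use only genes with a nonzero count in every sample, so that the
    # per-gene geometric mean is defined and nonzero.
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no gene has nonzero counts in all samples")
    # Log of the per-gene geometric mean across samples.
    log_geo_means = [
        sum(math.log(c) for c in row) / n_samples for row in usable
    ]
    factors = []
    for j in range(n_samples):
        # Ratio of each gene's count in sample j to that gene's geometric mean.
        log_ratios = [
            math.log(row[j]) - lgm for row, lgm in zip(usable, log_geo_means)
        ]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Toy data (made up): sample 2 is sequenced exactly 10x deeper than sample 1.
counts = [
    [10, 100],
    [20, 200],
    [5, 50],
    [40, 400],
]
sf = size_factors(counts)
normalized = [[c / f for c, f in zip(row, sf)] for row in counts]
```

In this toy table the two size factors differ by a factor of 10, so dividing by them puts both samples on the same scale. Real data will show gene-to-gene scatter around the median ratio.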

written 4 weeks ago by h.mon

Thank you h.mon for your kind replies!

Woongjae

written 4 weeks ago by woongjaej