Question: Differential Expression Using Different Libraries (TruSeq, Nextera)?
Nick wrote:

I have 3 replicates for each of 2 conditions (6 samples), which have been sequenced using different protocols (Nextera, TruSeq). Do I just need to use the library type as a blocking factor when defining the model for the differential expression?

I am planning to use edgeR but, I reckon, the same logic would apply to DESeq, too.

rna-seq • 3.4k views
modified 5.3 years ago by Michele Busby • written 5.3 years ago by Nick

Cross-posted here (and as I mentioned there, using the library type as a factor would indeed be the normal solution).

written 5.3 years ago by Devon Ryan
Michele Busby wrote:

It is difficult to know specifically what will happen with two different protocols unless you sequence the same sample with both methods. We do that a lot with K562 samples but I don't seem to have a Nextera sample handy to check for you.

My guess is that if you prepared the exact same sample with Nextera and TruSeq, you would see higher variance than if you compared Nextera with Nextera. Some protocols are very close to one another, but others do not look like the same sample (e.g. http://michelebusby.tumblr.com/image/62718357939). Because including the TruSeq data may raise the variance, you might not gain much power by adding the third replicate, and you could even lose power.

Scatter plots comparing each sample against every other sample would be the first place to look.
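As a rough numeric companion to those scatter plots, here is a minimal sketch that computes the Pearson correlation of log2(count + 1) for every pair of samples. The sample names and counts are invented for illustration, and the pseudocount of 1 is an assumption rather than anything edgeR prescribes. If the within-protocol pairs correlate noticeably better than the cross-protocol pairs, that is a hint the prep is adding variance.

```python
import math

def log2_counts(counts):
    """Log-transform raw counts with a pseudocount of 1 (an assumed choice)."""
    return [math.log2(c + 1) for c in counts]

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def pairwise_correlations(samples):
    """Return {(sample_a, sample_b): r} for every unordered pair of samples."""
    logged = {name: log2_counts(counts) for name, counts in samples.items()}
    names = sorted(logged)
    return {
        (a, b): pearson(logged[a], logged[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# Hypothetical toy counts (5 genes per sample), not real data.
samples = {
    "nextera_ctrl_1": [10, 200, 35, 0, 880],
    "nextera_ctrl_2": [12, 180, 40, 1, 910],
    "truseq_ctrl_1": [30, 150, 20, 5, 600],
}

correlations = pairwise_correlations(samples)
for (a, b), r in sorted(correlations.items()):
    print(f"{a} vs {b}: r = {r:.3f}")
```

With real data you would feed in one full count vector per sample and still look at the actual plots, since a single correlation value hides which genes are misbehaving.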

The extra variability between protocols is usually biased toward particular genes, i.e. it's non-random noise. Usually high-GC genes vary more by protocol than low-GC genes, and sometimes it is the short reads that bounce around more.

If you put all three replicates through edgeR you might get a screwy variance fit. edgeR and DESeq both use a uniform or quasi-uniform variance calculation, which means they basically say all genes at a given depth have the same variance. The call, however, is based on the difference in the means. The means might bounce around more for some genes, so I would expect you to introduce some bias into what you call. Without an experiment comparing Nextera and TruSeq on the same sample, it is difficult to correct for that bias in downstream analyses.

I might devise a different design in which you put the Nextera samples through edgeR by themselves and then separately confirm the direction of the calls against the TruSeq data. You'd have to think about the stats, but I think that would use all the information without introducing too much bias into your calls.
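A minimal sketch of the confirmation step, assuming you already have log fold changes for the genes called in the Nextera-only analysis and log fold changes for the same contrast in the TruSeq samples. The function name, gene names and values are all hypothetical, and simple sign agreement is just one way to "confirm direction", not a worked-out statistical test.

```python
def direction_concordance(discovery_lfc, validation_lfc):
    """Compare the sign of log fold changes between two analyses.

    Returns (fraction of shared genes whose sign agrees,
             sorted list of shared genes whose sign disagrees).
    Genes absent from the validation set are ignored.
    """
    shared = [gene for gene in discovery_lfc if gene in validation_lfc]
    if not shared:
        raise ValueError("no genes shared between the two sets")
    agree = [g for g in shared if discovery_lfc[g] * validation_lfc[g] > 0]
    discordant = sorted(set(shared) - set(agree))
    return len(agree) / len(shared), discordant

# Hypothetical log fold changes (treatment vs control), not real data.
nextera_calls = {"geneA": 2.1, "geneB": -1.4, "geneC": 0.9, "geneD": -3.0}
truseq_lfc = {"geneA": 1.2, "geneB": -0.3, "geneC": -0.5, "geneD": -2.2, "geneE": 0.1}

fraction, discordant = direction_concordance(nextera_calls, truseq_lfc)
print(f"{fraction:.0%} of calls agree in direction; discordant: {discordant}")
```

Here three of the four Nextera calls keep their sign in the TruSeq data and geneC flips, so it would be the one to treat with suspicion.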

Edit: Joshua Levin publishes this type of work a lot. His papers give a good overview of what happens with different protocols, e.g. http://www.nature.com/nmeth/journal/v7/n9/abs/nmeth.1491.html and http://www.nature.com/nmeth/journal/v10/n7/full/nmeth.2483.html

modified 5.3 years ago • written 5.3 years ago by Michele Busby

Nice work on the response. Bonus for citing relevant literature.

written 5.3 years ago by David Quigley
Rory Kirchner wrote:

How are the different preps distributed? Is one condition made with one kit and the other with another kit? Or are some of the replicates from each condition made with the different kits?

written 5.3 years ago by Rory Kirchner

Each kit is used for an equal number of control and treatment samples, i.e. TruSeq for one treatment and one control, Nextera for two treatments and two controls.

written 5.3 years ago by Nick

Great, adding it as a blocking factor is the way to go then. I would look at an MDS plot of the samples and see how they cluster; if the library prep doesn't appear to influence the clustering at all, I'd also consider dropping it from further analyses. I'd also look at any genes that are called DE between the two library preps; you might be able to glean some information about how the different preps are affecting your experiment.
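To make the blocking-factor idea concrete, here is a small sketch of the additive design matrix for the layout described above (TruSeq: one control and one treatment; Nextera: two of each). The sample names are made up, and the choice of "control" and "truseq" as reference levels is an assumption. In R this layout corresponds to the usual `model.matrix(~ library + condition)` formula; the Python below only spells out what those columns look like and checks that each kit covers both conditions, which is what lets the library effect be separated from the treatment effect.

```python
# (sample name, condition, library prep); names are hypothetical.
samples = [
    ("ctrl_1", "control", "nextera"),
    ("ctrl_2", "control", "nextera"),
    ("ctrl_3", "control", "truseq"),
    ("trt_1", "treatment", "nextera"),
    ("trt_2", "treatment", "nextera"),
    ("trt_3", "treatment", "truseq"),
]

def design_matrix(samples, ref_condition="control", ref_library="truseq"):
    """Build rows of [intercept, library indicator, condition indicator].

    The library column is the blocking factor; the condition column is the
    coefficient you would test for differential expression.
    """
    return [
        [1, int(library != ref_library), int(condition != ref_condition)]
        for _, condition, library in samples
    ]

def every_library_has_both_conditions(samples):
    """True if each library prep was used for at least two conditions."""
    conditions_by_library = {}
    for _, condition, library in samples:
        conditions_by_library.setdefault(library, set()).add(condition)
    return all(len(conds) >= 2 for conds in conditions_by_library.values())

design = design_matrix(samples)
balanced = every_library_has_both_conditions(samples)

print("intercept  nextera  treatment")
for (name, _, _), row in zip(samples, design):
    print(name, row)
print("each kit covers both conditions:", balanced)
```

If one kit had been used for only one condition, the check would return False and you would want to look hard at whether the treatment effect can still be separated from the kit effect.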

written 5.3 years ago by Rory Kirchner