Question: What is the best approach to obtaining relative expression counts from RIP-Seq data?
snamjoshi87 wrote, 3.3 years ago:

Approaches for ChIP-Seq data may be applicable, so I have included the chip-seq tag.

My RIP-Seq data has a number of conditions and the same experiment performed in a knockout animal. Both KO and WT have been sequenced. The knockout essentially represents background noise for its corresponding WT sample.

DGE analysis is useful, but I would also like to obtain just the relative counts for all of my aligned data. Relative counts would be useful because they would tell me "which transcripts, and how many (in relative terms), are potentially bound to the protein of interest" rather than "how does the pool of bound transcripts change between experimental conditions".

For a normal RNA-seq experiment, the approach would be to normalize the data so that counts can be compared between conditions. In my case, I have knockouts representing background that must somehow be factored into the calculation of relative counts. DESeq2 and similar software let you set up designs that treat the KO as an interaction term when performing DGE analysis. Is there a similar approach I can use for simply looking at relative counts? If not, how can I obtain relative counts and subtract out any background?

rip-seq chip-seq normalization
Asaf wrote, 3.3 years ago:

You can use DESeq2 with a two-factor design that includes an interaction term. WT/KO is one factor, RIP/total is another, and their interaction is the third term. The WT/KO effect is not interesting in itself; the RIP/total effect gives you the average relative abundance of each RNA on the protein, and the interaction tells you how this relative abundance changes. Since RIP/total can't physically be larger than 1, you should scale the values you get by the highest value (the 99th percentile may be a better choice).
Another, more straightforward approach is to skip library normalization and simply divide the number of reads in the RIP library by the number of reads in the corresponding total RNA library. Scale these values to between 0 and 1 (again using the 99th percentile to avoid outliers) and you will get the relative abundance in each condition.
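
A minimal Python sketch of this second approach, for illustration only. It assumes the division is done per gene on raw counts, that genes with zero reads in the total library are left undefined, and that scaled values above 1 are capped at 1. The function names `percentile` and `relative_abundance` are made up for this example and do not come from any package.

```python
import math

def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def relative_abundance(rip_counts, total_counts, q=99.0):
    """Per-gene RIP/total ratio, scaled by the q-th percentile and capped at 1.

    Genes with no reads in the total library get None (ratio undefined).
    """
    if len(rip_counts) != len(total_counts):
        raise ValueError("count vectors must have the same length")
    # Raw per-gene ratio; no library-size normalization is applied here.
    ratios = [r / t if t > 0 else None for r, t in zip(rip_counts, total_counts)]
    defined = [x for x in ratios if x is not None]
    if not defined:
        raise ValueError("no gene has reads in the total library")
    # Scale by a high percentile rather than the maximum to limit outlier influence.
    scale = percentile(defined, q)
    if scale <= 0:
        raise ValueError("scaling value must be positive")
    return [None if x is None else min(x / scale, 1.0) for x in ratios]

# Toy counts for five genes in one condition (invented numbers).
rip = [10, 50, 0, 200, 5]
total = [100, 100, 0, 100, 50]
scaled = relative_abundance(rip, total)
```

You would run this once per condition on the matching RIP and total libraries. How to treat the KO samples is left open here, since the suggestion above does not spell it out.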


How do I extract the counts from your first suggestion? If I fit the model with DESeq2, the counts it returns (baseMean) do not take the model or its interactions into account.

written 3.3 years ago by snamjoshi87

I didn't understand your question. You have 4 types of experiments described by 2 factors; just get a count table for each experiment and enter them into DESeq2 with the appropriate formula.

written 3.3 years ago by Asaf

Isn't using DESeq2 just going to give me a list of differentially expressed genes and fold changes? That's a separate question. I simply want to know: "Which RNAs come down with my protein of interest in a given condition after subtracting out the background, and what are their normalized counts?" DESeq2 tells me: "Which RNAs are differentially expressed between two conditions?" Your second suggestion seems to be what I am looking for. I was just unsure how using DESeq2 as in your first suggestion would give me relative counts. You say to scale the RIP/total values from 0 to 1. Which values? The counts DESeq2 reports (baseMean) are just averages; they don't take the terms in the formula into account. Is this a little clearer?

written 3.2 years ago by snamjoshi87

DESeq2 gives you a lot more than just DE genes. It also gives you the fitted coefficient for each term in the model, including the interaction.
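
To make "the coefficient of each factor" concrete, here is a rough Python sketch for a single gene. It is not DESeq2: it just takes group means of log2(count + 1) in a 2x2 layout, whereas DESeq2 fits a negative binomial model with library size factors. The reference levels (KO and total), the pseudocount, and the function name `two_factor_coefficients` are all assumptions made for illustration.

```python
import math
from statistics import mean

def two_factor_coefficients(counts, pseudocount=1.0):
    """Coefficients of a saturated genotype x assay model on log2 counts.

    counts maps (genotype, assay) -> list of replicate counts for one gene,
    with genotype in {"KO", "WT"} and assay in {"total", "RIP"}.
    Reference levels are assumed to be KO and total.
    """
    # Mean log2 count per group; the pseudocount avoids log2(0).
    m = {key: mean(math.log2(c + pseudocount) for c in reps)
         for key, reps in counts.items()}
    intercept = m[("KO", "total")]
    genotype = m[("WT", "total")] - intercept   # WT vs KO in the total RNA
    assay = m[("KO", "RIP")] - intercept        # RIP vs total in KO (background)
    # Extra RIP/total enrichment in WT beyond what the KO background shows.
    interaction = (m[("WT", "RIP")] - m[("WT", "total")]) - assay
    return {"intercept": intercept, "genotype": genotype,
            "assay": assay, "interaction": interaction}

# Toy counts for one gene (invented numbers, one or two replicates per group).
gene = {
    ("KO", "total"): [127, 127],
    ("WT", "total"): [127],
    ("KO", "RIP"): [15],
    ("WT", "RIP"): [511],
}
coefs = two_factor_coefficients(gene)
```

With these reference levels, the assay term is the RIP/total log ratio in the KO background and the interaction is the additional log enrichment in WT, which is close to a "background-subtracted" value on the log scale. Choosing WT as the reference instead would shift which term carries which meaning.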

written 3.2 years ago by Asaf