Question

What is the best approach to obtaining relative expression counts of RIPSeq data?

1

8.9 years ago

snamjoshi87 ▴ 40

Approaches for ChIP-Seq data may be applicable here, so I have included the ChIP-Seq tag.

My RIP-Seq data covers several conditions, and each experiment was also performed in a knockout (KO) animal. Both KO and wild-type (WT) samples have been sequenced. The knockout essentially represents background noise for its corresponding WT sample.

Differential gene expression (DGE) analysis is useful, but I would also like to obtain just the relative counts for all of my aligned data. Relative counts would tell me "which transcripts, and how many of each in relative terms, are potentially bound to the protein of interest" rather than "how does the pool of bound transcripts change between experimental conditions".

For a normal RNA-seq experiment, the approach would be to normalize the data so that counts can be compared between conditions. In my case, the knockouts represent background that must somehow be factored into the calculation of relative counts. DESeq2 and similar software let you set up designs with interaction terms, so the KO can be treated as an interacting term when performing DGE analysis. Is there a similar approach I can use for simply looking at relative counts? If not, how can I obtain relative counts and subtract out any background?

ChIP-Seq RIP-Seq Normalization • 2.9k views

Updated 8.9 years ago by Asaf 10k • written 8.9 years ago by snamjoshi87 ▴ 40

score 2 · Answer 1 · 2016-08-11

8.9 years ago

Asaf 10k

You can use DESeq2 with a design that includes an interaction term. You'll have WT/KO as one factor, RIP/total as another, and their interaction as a third term. The WT/KO term is not interesting; the RIP/total term will give you the average relative abundance of each RNA on the protein, and the interaction will tell you how this relative abundance changes. Since RIP/total can't physically be larger than 1, you should scale the values you get by the highest value (the 99th percentile may be a better choice).
Another, more straightforward approach would be to skip library normalization and, for each transcript, simply divide the number of reads in the RIP library by the number of reads in the matching total RNA library. Scale these values to lie between 0 and 1 (again, using the 99th percentile to avoid outliers) and you will get the relative abundance in each condition.
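
A minimal sketch of this second approach, in plain Python. The function names (`percentile`, `relative_abundance`), the `min_total` filter, and the toy counts are assumptions made for illustration; they are not part of the answer or of any library.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def relative_abundance(rip_counts, total_counts, q=99.0, min_total=1):
    """Per-gene RIP/total ratio, scaled by its q-th percentile and capped at 1.

    Genes with fewer than min_total reads in the total RNA library are
    skipped (a hypothetical filter to avoid dividing by zero).
    """
    ratios = {
        gene: rip_counts.get(gene, 0) / total
        for gene, total in total_counts.items()
        if total >= min_total
    }
    scale = percentile(list(ratios.values()), q)
    if scale <= 0:
        raise ValueError("scaling percentile is zero; no RIP signal to scale by")
    # Values above the chosen percentile are treated as outliers and capped.
    return {gene: min(r / scale, 1.0) for gene, r in ratios.items()}


# Toy raw (unnormalized) counts for one condition; the numbers are made up.
rip = {"geneA": 50, "geneB": 5, "geneC": 0, "geneD": 200}
total = {"geneA": 100, "geneB": 100, "geneC": 80, "geneD": 100}
rel = relative_abundance(rip, total)
```

The same function could be run on each WT and KO library pair separately, which would give one 0-to-1 value per gene per condition to compare against the background.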


How do I extract the counts from your first suggestion? If I fit the model with DESeq2, the counts it returns (baseMean) do not take the model or its interactions into account.

Reply • 8.8 years ago by snamjoshi87 ▴ 40

I don't understand your question. You have 4 types of experiments defined by 2 factors; just get a count table for each experiment and feed them into DESeq2 with the appropriate formula.
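
To make the "4 types of experiments with 2 factors" layout concrete, here is a small sketch of the design matrix rows that a formula with both factors and their interaction would imply. The reference levels (KO and total) and the column names are assumptions chosen for illustration; this is plain Python, not DESeq2 code.

```python
from itertools import product

GENOTYPES = ("KO", "WT")    # first level is the reference (assumed)
ASSAYS = ("total", "RIP")   # first level is the reference (assumed)
COLUMNS = ("intercept", "genotype_WT", "assay_RIP", "genotype_WT:assay_RIP")


def design_row(genotype, assay):
    """Return the design matrix row for one sample, in COLUMNS order."""
    g = 1 if genotype == "WT" else 0
    a = 1 if assay == "RIP" else 0
    return [1, g, a, g * a]


# One row per experiment type: KO/total, KO/RIP, WT/total, WT/RIP.
design = {(g, a): design_row(g, a) for g, a in product(GENOTYPES, ASSAYS)}
for sample, row in design.items():
    print(sample, dict(zip(COLUMNS, row)))
```

With this coding, only the WT RIP samples carry the interaction column, so that coefficient captures how RIP enrichment in WT differs from RIP enrichment in the KO background.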

Reply • 8.8 years ago by Asaf 10k

Isn't using DESeq2 just going to give me a list of differentially expressed genes and fold changes? That's a separate question. I simply want to know: "Which RNAs come down with my protein of interest in a given condition after subtracting out the background, and what are their normalized counts?" DESeq2 tells me: "Which RNAs are differentially expressed between two conditions?" Your second suggestion seems to be what I am looking for. I was just unsure how using DESeq2 in your first suggestion would give me relative counts. You say to scale the RIP/total values from 0 to 1. Which values? The counts DESeq2 reports (baseMean) are just averages; they don't take the terms in the formula into account. Is that a little clearer?

Reply • 8.8 years ago by snamjoshi87 ▴ 40

DESeq2 gives you a lot more than just DE genes. It also gives you the coefficient for each term in the model, including the interaction.
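
As a rough illustration of what those coefficients represent, the sketch below computes them for one gene from group means of log2 counts. This is a deliberate simplification, not what DESeq2 does internally (DESeq2 fits a negative binomial GLM with size factors and dispersion estimates). The function names, the pseudocount, the reference levels (KO and total), and the toy counts are all assumptions.

```python
import math


def log2_mean(counts, pseudocount=1.0):
    """Mean of log2(count + pseudocount) over replicates."""
    return sum(math.log2(c + pseudocount) for c in counts) / len(counts)


def two_factor_coefficients(ko_total, ko_rip, wt_total, wt_rip):
    """Log2-scale coefficients of a saturated genotype x assay model for one gene.

    Reference levels (assumed): genotype = KO, assay = total.
    """
    kt, kr = log2_mean(ko_total), log2_mean(ko_rip)
    wt, wr = log2_mean(wt_total), log2_mean(wt_rip)
    return {
        "intercept": kt,                       # KO total RNA level
        "genotype_WT": wt - kt,                # WT vs KO in total RNA
        "assay_RIP": kr - kt,                  # RIP vs total in KO (background)
        "interaction": (wr - wt) - (kr - kt),  # extra RIP enrichment in WT
    }


# Toy replicate counts for a single gene; the numbers are made up.
coefs = two_factor_coefficients(
    ko_total=[63, 63], ko_rip=[7, 7], wt_total=[63, 63], wt_rip=[31, 31]
)
```

Under these assumptions, the interaction value is the RIP/total log2 ratio in WT minus the same ratio in KO, which is one way to read "enrichment after accounting for background". In DESeq2 itself, the fitted terms can be listed with `resultsNames()` and pulled out with `results()`.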

Reply • 8.8 years ago by Asaf 10k