EDIT: OP deleted most of the content of their post, retaining only the two sentences below, but I've restored the original content from Google Cache.
- Ram
Hi everyone, thanks for your time and your help!
ORIGINAL CONTENT:
Posted 6 months ago by am835821
Hi everyone,
I am analyzing a RIP-seq experiment made up of 12 RNA libraries:
- 6 "control" libraries: 3 input (total RNA) and their 3 corresponding immunoprecipitated RNAs (IP)
- 6 "affected" libraries: 3 input (total RNA) and their 3 corresponding immunoprecipitated RNAs (IP)

The 12 libraries are:
- Input control 1, Input control 2, Input control 3
- IP control 1, IP control 2, IP control 3
- Input affected 1, Input affected 2, Input affected 3
- IP affected 1, IP affected 2, IP affected 3
I would like to compare the 3 control Inputs with the 3 affected Inputs on the one hand, and the 3 control IPs with the 3 affected IPs on the other.
I am starting with a single raw count table of the 12 libraries. My question is: should I split the table in half at the very beginning, into an Input count table and an IP count table, and then perform all the normalization and DE analysis steps in parallel? Or should I keep the 12 libraries in the same count table and normalize them all together?
I tried both and, unless I'm mistaken, the outputs are different. I cannot figure out which is the right choice. In my opinion, it is not appropriate to compare the 6 Input libraries with each other using a count table normalized across all 12 libraries.
Thanks for your time and your help!
I have no experience with RIP-seq, but in something like ChIP-seq one usually only uses the inputs for peak calling: they are globally different from the actual IPs, so proper normalization for a DE analysis is close to impossible. Is this also the case in RIP-seq?
Thanks for your reply. I have no experience with ChIP-seq and I am just starting the analysis workflow, so I don't really know whether RIP-seq Inputs are handled the same way as in ChIP-seq, but maybe I can clarify what we would like to get, if it helps:
I assume that the raw counts of the Input and IP libraries should be normalized separately, but if I do it that way, none of the adjusted P values are significant, whereas when I normalize the whole table with both Input and IP libraries, I get about 50 significantly deregulated genes.
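
To illustrate why the two approaches can give different results, here is a minimal sketch of how normalization factors depend on which libraries are in the table. It assumes a median-of-ratios normalization of the kind DESeq2 uses; the thread does not say which tool was actually run, so treat that as an assumption. The counts are simulated, and `size_factors`, the enrichment pattern and all numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

def size_factors(counts):
    """Median-of-ratios size factors for a genes x libraries count matrix.

    Hypothetical helper written for this sketch, not a library function.
    """
    counts = np.asarray(counts, dtype=float)
    # Keep genes with a nonzero count in every supplied library (finite logs).
    expressed = (counts > 0).all(axis=1)
    log_counts = np.log(counts[expressed])
    # Per-gene reference: log geometric mean over the libraries supplied.
    log_ref = log_counts.mean(axis=1, keepdims=True)
    # Per-library factor: median ratio of that library to the reference.
    return np.exp(np.median(log_counts - log_ref, axis=0))

rng = np.random.default_rng(0)
n_genes = 2000

# Simulated (made-up) expression levels shared by all libraries.
base = rng.gamma(shape=2.0, scale=50.0, size=n_genes)

# Assumption: IP libraries are globally different from Inputs. Here a subset
# of genes is strongly enriched and the rest are depleted.
ip_profile = np.full(n_genes, 0.5)
ip_profile[:600] = 4.0

input_counts = rng.poisson(base[:, None] * np.ones(6))               # 6 Input libraries
ip_counts = rng.poisson((base * ip_profile)[:, None] * np.ones(6))   # 6 IP libraries
counts = np.hstack([input_counts, ip_counts])                        # full 12-library table

joint = size_factors(counts)            # normalize all 12 libraries together
split_input = size_factors(counts[:, :6])   # normalize the Inputs on their own
split_ip = size_factors(counts[:, 6:])      # normalize the IPs on their own

print("Input factors, joint table:", np.round(joint[:6], 3))
print("Input factors, Input only :", np.round(split_input, 3))
print("IP factors, joint table   :", np.round(joint[6:], 3))
print("IP factors, IP only       :", np.round(split_ip, 3))
```

In this sketch the per-gene reference is built from whatever libraries are in the table, so adding the IP libraries changes the reference and therefore the factors assigned to the Inputs. This only shows that the two approaches are not equivalent; it does not say which one is appropriate for a given RIP-seq design.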