Hi all. We have just generated some PRO-seq data (https://doi.org/10.1038/nprot.2016.086) and followed the instructions for adding Drosophila nuclei at a fixed percentage prior to the run-on reaction. The baseline sample looks good -- for 2% Drosophila nuclei, for example, I ended up with 47794146 reads aligned to the human genome and 1516945 reads aligned to Drosophila (so ~3.2% of the human-aligned count). There are two other timepoints following my treatment where, despite the total number of reads and the added spike-in percentage being about the same, the proportion of Drosophila-mapping reads increases fairly dramatically (up to almost 30% for our final condition). I think our intervention results in a loss of Pol II release/active transcription, so this kinda makes sense.
Analyzing this stuff is new territory for us though. I have seen posts on handling spike-in normalization for ChIP-seq, RNA-seq, etc., but I'm not sure the same applies to analyzing run-on sequencing. Does anyone have experience or advice on how to apply this if I'm not using the standard analysis packages (like DESeq2 etc.)? Is it as simple as scaling the coverage (using something like bedtools or deepTools) with factors based on the proportion of Drosophila reads in experimental vs control conditions?
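To make the question concrete, here is a minimal sketch of the scaling I have in mind. It is not something I have validated, and it assumes the same number of Drosophila nuclei went into every sample, so that each sample's coverage can be multiplied by (baseline spike-in reads) / (that sample's spike-in reads). The function name and the two later-timepoint counts are made-up placeholders; only the baseline count is from my data.

```python
def spikein_scale_factors(spike_counts, reference):
    """Per-sample factors that equalize spike-in read counts to the reference sample.

    spike_counts: dict mapping sample name -> number of Drosophila-aligned reads
    reference:    name of the sample whose factor should be 1.0
    """
    ref_count = spike_counts[reference]
    return {sample: ref_count / count for sample, count in spike_counts.items()}


# Drosophila-aligned read counts per sample.
spike_counts = {
    "baseline": 1_516_945,       # real count from my baseline sample
    "timepoint_1": 5_000_000,    # hypothetical placeholder
    "timepoint_2": 12_000_000,   # hypothetical placeholder
}

factors = spikein_scale_factors(spike_counts, reference="baseline")

for sample, factor in factors.items():
    print(f"{sample}\t{factor:.4f}")
```

The idea would then be to pass each factor to something like `bamCoverage --scaleFactor` (deepTools) or `bedtools genomecov -scale` when building the human coverage tracks, so that samples with more spike-in reads get scaled down.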
thanks! -Lynn