I have been given RNA-seq data from a human cell line infected with a virus that can be activated. Upon activation, the fraction of reads mapping to the human genome drops sharply (from ~75% to ~40%), reflecting a large shift from human to viral expression. We decided to map the data to a combined genome (human + virus) and obtain counts for both.
I am stuck deciding how to do the differential expression (DE) analysis. We want to look at DE in both the host and the viral genome, and we have triplicates of each condition. Since there is a global shift in expression from host to virus, what is the best way to normalize the counts? My initial thought was to import the counts into DESeq2, following the recommendation in Devon Ryan's response to this Biostars question.
Would we be violating the normalization assumption that most genes do not change between conditions, i.e. that there is no global change?
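To make the concern concrete, here is a minimal sketch of a median-of-ratios size factor calculation, the kind of normalization DESeq2 uses by default, written in plain Python rather than taken from DESeq2 itself. The gene names and counts are made-up toy values, and restricting the calculation to host genes via the hypothetical `control_genes` argument is only one possible choice (DESeq2's `estimateSizeFactors` has a `controlGenes` argument for a similar purpose). It still assumes that most host genes do not change per cell, which is exactly what I cannot verify without spike-ins.

```python
import math
from statistics import median


def median_of_ratios_size_factors(counts, control_genes=None):
    """Estimate one size factor per sample with a median-of-ratios scheme.

    counts: dict mapping gene name -> list of raw counts (one per sample).
    control_genes: optional iterable of gene names used for the estimate
                   (hypothetical parameter; defaults to all genes).
    """
    genes = list(counts) if control_genes is None else [
        g for g in control_genes if g in counts
    ]
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]

    for gene in genes:
        row = counts[gene]
        # Genes with a zero in any sample have no usable geometric mean.
        if any(c <= 0 for c in row):
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)

    if not log_ratios[0]:
        raise ValueError("no gene has non-zero counts in every sample")

    # Size factor = median ratio of a sample's counts to the per-gene geometric mean.
    return [math.exp(median(r)) for r in log_ratios]


# Toy data (assumed, not real): column 0 = latent, column 1 = activated.
# Host counts halve after activation while viral counts rise sharply.
toy_counts = {
    "host_A": [100, 50],
    "host_B": [200, 100],
    "host_C": [300, 150],
    "host_D": [400, 200],
    "virus_X": [10, 500],
    "virus_Y": [20, 1000],
    "virus_Z": [30, 1500],
}
host_genes = [g for g in toy_counts if g.startswith("host_")]

host_size_factors = median_of_ratios_size_factors(toy_counts, control_genes=host_genes)
normalized = {
    gene: [c / s for c, s in zip(row, host_size_factors)]
    for gene, row in toy_counts.items()
}
```

In this toy case the host-derived size factors absorb the entire drop in host reads, so the normalized host counts are identical between conditions. That is the correct answer only if host transcription per cell really is unchanged; if the virus shuts down host transcription globally, this normalization would hide it.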
There is another, similar question addressed here by Carlo Yague. It looks like the best normalization in this case would use spike-ins. Unfortunately, this dataset does not include spike-ins.
What is the best way to proceed? What are your thoughts?