I am working with a virally infected cell line, so my RNA-seq reads are a mixture of host and viral transcripts. I am interested in the viral transcripts that are differentially expressed between the WT and a mutant cell line, but I am not sure of the best way to analyze this data.
- Should I align the reads to a combined viral + host genome, count the reads separately for the virus and the host using their respective GTF files, and then combine the counts as input for DESeq2?
- Or is it better to align the reads to the virus first, then align the unmapped reads to the host, and proceed as before?
Also, is DESeq2 the best way to do differential gene expression (DGE) analysis on small genomes, or is there a better way to normalize the raw counts before comparing fold changes? I am new to RNA-seq and would appreciate any input!
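To make the first option concrete, here is a rough Python sketch of what I mean by combining the counts and then normalizing. It is only an illustration of the idea, not DESeq2 itself: the gene names, the counts, and the helper functions `combine_counts` and `size_factors` are all made up, and the normalization is a plain re-implementation of the median-of-ratios approach as I understand it.

```python
import math
from statistics import median

def combine_counts(host, virus):
    """Merge two gene -> per-sample count tables, refusing duplicate gene IDs."""
    overlap = set(host) & set(virus)
    if overlap:
        raise ValueError(f"gene IDs present in both tables: {sorted(overlap)}")
    combined = dict(host)
    combined.update(virus)
    return combined

def size_factors(counts):
    """Median-of-ratios size factors, one per sample (illustrative only)."""
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        # Genes with a zero count in any sample are skipped,
        # because their geometric mean would be zero.
        if min(row) <= 0:
            continue
        logs = [math.log(c) for c in row]
        log_geo_mean = sum(logs) / len(logs)
        for j, log_count in enumerate(logs):
            log_ratios[j].append(log_count - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]

# Made-up toy counts; columns are WT_1, WT_2, mutant_1, mutant_2.
host = {
    "hostA": [100, 200, 100, 200],
    "hostB": [50, 100, 50, 100],
    "hostC": [10, 20, 10, 20],
}
virus = {"virX": [30, 60, 120, 240]}

combined = combine_counts(host, virus)
sf = size_factors(combined)
normalized = {
    gene: [c / s for c, s in zip(row, sf)]
    for gene, row in combined.items()
}
print("size factors:", [round(s, 3) for s in sf])
print("virX normalized:", [round(v, 1) for v in normalized["virX"]])
```

In this toy case the host genes outnumber the single viral gene, so they determine the size factors, and the viral counts end up scaled by host sequencing depth. My worry is what happens when the size factors are estimated from a handful of viral genes alone, or when viral load itself differs between WT and mutant.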