Is there any method for estimating the number of viral genomes per host genome from whole-exome sequencing data? I am specifically interested in HPV, which integrates into the host genome in head and neck cancer and also in cervical cancer.
It sounds like the most straightforward approach would be to map the reads to the host genome and the viral genomes simultaneously, then calculate the ratio of viral coverage to host coverage. However, there's no reason to expect viral DNA in exome sequencing because the baits wouldn't be designed for it, so I think you'd need whole-genome sequencing. Exome capture distorts abundance calculations anyway because bait efficiency differs between targets, so even if you designed baits for your viruses, you'd need calibration.
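
For what it's worth, the coverage-ratio arithmetic could be sketched like this in Python. This is only a rough illustration, assuming whole-genome data where you have already counted the aligned bases on the viral and host references (for example from your aligner's per-contig statistics). The function names, the `host_ploidy` parameter, and all the numbers in the usage lines are made up for the example, and tumour samples often aren't diploid, so treat any per-cell figure with caution.

```python
def mean_depth(aligned_bases: int, reference_length: int) -> float:
    """Mean sequencing depth: total aligned bases divided by reference length."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    return aligned_bases / reference_length


def viral_copies_per_host_genome(
    viral_aligned_bases: int,
    viral_length: int,
    host_aligned_bases: int,
    host_length: int,
    host_ploidy: float = 1.0,
) -> float:
    """Estimate viral genome copies relative to the host from a depth ratio.

    With host_ploidy=1.0 the result is copies per haploid host genome.
    Passing host_ploidy=2.0 rescales it to copies per diploid cell, which
    is an assumption that may not hold for aneuploid tumour samples.
    """
    host_depth = mean_depth(host_aligned_bases, host_length)
    if host_depth == 0:
        raise ValueError("no host coverage, so the ratio is undefined")
    viral_depth = mean_depth(viral_aligned_bases, viral_length)
    return host_ploidy * viral_depth / host_depth


# Hypothetical inputs: a ~7.9 kb viral reference at 100x mean depth
# and a ~3.1 Gb host reference at 30x mean depth.
ratio = viral_copies_per_host_genome(
    viral_aligned_bases=790_600,
    viral_length=7_906,
    host_aligned_bases=93_000_000_000,
    host_length=3_100_000_000,
)
print(f"viral copies per haploid host genome: {ratio:.2f}")
```

Using mean depth rather than raw read counts normalises for the very different reference lengths. It doesn't correct for capture bias, which is why this only makes sense for whole-genome data or for capture data that has been calibrated.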