I like RSEM and would like to use it for RNA-seq analysis from xenograft (human tumor cells grown in mouse) samples. I have several questions:
1) In a few papers I read, the authors mapped RNA-seq reads from xenografts to an augmented (merged) human+mouse genome, but I have also seen forum posts suggesting that the reads be mapped to the human and mouse genomes independently, in parallel. What are the advantages and disadvantages of the two approaches (mapping to an augmented genome vs. mapping to the human and mouse genomes separately)?
2) When an augmented genome is used, it has been suggested that multi-mapped reads (those mapping to both human and mouse) be removed prior to expression quantification. On the other hand, if I understand correctly, RSEM assigns multi-mapped reads proportionally to all their mapped locations, and there does not appear to be an option to consider only uniquely mapped reads and discard multi-mapped ones. Does anyone know if that's the case?
Not a direct answer to your questions, but this blog post discusses some of your concerns.
Use featureCounts or HTSeq; by default, both discard multi-mapped reads (featureCounts counts them only if you pass -M).
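If you go the augmented-genome route and want to drop the cross-species reads yourself before quantification, here is a minimal sketch of the idea in plain Python. It assumes you have already extracted (read name, contig name) pairs from your alignments and that the merged reference uses contig prefixes to mark species; the `hs_` and `mm_` prefixes and the `classify_reads` helper are hypothetical, not part of RSEM or any other tool.

```python
from collections import defaultdict


def classify_reads(alignments, human_prefix="hs_", mouse_prefix="mm_"):
    """Label each read as 'human', 'mouse', or 'ambiguous'.

    alignments: iterable of (read_name, contig_name) pairs, one per
    reported alignment. The prefixes are assumed naming conventions
    for the merged reference, not something any aligner enforces.
    """
    species_by_read = defaultdict(set)
    for read_name, contig in alignments:
        if contig.startswith(human_prefix):
            species_by_read[read_name].add("human")
        elif contig.startswith(mouse_prefix):
            species_by_read[read_name].add("mouse")

    labels = {}
    for read_name, species in species_by_read.items():
        # A read hitting contigs of both species is ambiguous;
        # a read hitting several loci of one species keeps that label.
        labels[read_name] = next(iter(species)) if len(species) == 1 else "ambiguous"
    return labels


# Toy input: read3 aligns to both species, read4 to two human loci.
example = [
    ("read1", "hs_chr1"),
    ("read2", "mm_chr3"),
    ("read3", "hs_chr7"),
    ("read3", "mm_chr5"),
    ("read4", "hs_chr2"),
    ("read4", "hs_chr9"),
]

labels = classify_reads(example)
human_only = sorted(name for name, label in labels.items() if label == "human")
print(human_only)
```

Note that this only discards reads that are ambiguous between species. Reads that multi-map within the human genome (like read4 above) are kept, so a tool that handles multi-mappers probabilistically can still use them.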