Best approach to compare human and mouse gene expression quantified by RNA-seq
5.1 years ago
samuelrivero ▴ 50


I know this is a very general question. I would like to compare RNA-seq expression data from human and mouse. I have a large cohort of RNA-seq data from human tumors (several subgroups) and RNA-seq data from a mouse tumor model, and I would like to know whether the tumors generated in mice cluster with any of the human subgroups.

My question is how to normalize the data between the human and mouse RNA-seq datasets.

I mapped the reads to the respective genomes, quantified gene expression, and merged the expression data for genes that are orthologous between human and mouse. When I cluster the data as is, human and mouse samples hardly ever end up close together, so I think some normalization should be applied.

I have looked at the literature, but the approaches vary widely (from no normalization at all to several steps of normalization).

I am looking for suggestions for a standard way to do this.

Thank you in advance.

RNA-Seq orthologous gene expression • 2.7k views

If I were asked to do it, my approach would be:

  1. count read abundances over comparable versions of the GENCODE annotation for GRCh (human) and GRCm (mouse)
  2. filter the raw counts for orthologous genes
  3. normalise the data and log transform
  4. convert the log-transformed data to the Z scale, separately within each dataset
  5. compare

You don't have my permission to state that this approach is in any way valid, though.
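Steps 2 to 4 above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a validated pipeline: counts are held as plain dictionaries (gene to list of per-sample raw counts), `orthologs` is a hypothetical one-to-one human-to-mouse gene mapping, normalisation is a simple log2 counts-per-million, and Z-scaling is done per gene within each species before the two matrices are joined. The function names are made up for this example; in practice one would more likely use a dedicated normalisation package.

```python
import math
from statistics import mean, pstdev


def log_cpm(counts, prior=1.0):
    """Step 3: library-size normalise to counts per million, then log2.

    counts: dict mapping gene -> list of raw counts, one entry per sample.
    prior: pseudo-count added to avoid log2(0) (an assumption of this sketch).
    """
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [math.log2(c / lib_sizes[j] * 1e6 + prior) for j, c in enumerate(row)]
        for gene, row in counts.items()
    }


def z_scale(expr):
    """Step 4: per-gene Z-score across the samples of ONE dataset."""
    scaled = {}
    for gene, row in expr.items():
        m, s = mean(row), pstdev(row)
        # Genes with no variation carry no information; set them to 0.
        scaled[gene] = [(x - m) / s if s > 0 else 0.0 for x in row]
    return scaled


def combine(human_counts, mouse_counts, orthologs):
    """Steps 2-4: keep ortholog pairs present in both datasets, normalise
    and Z-scale each species separately, then join under the human gene name."""
    pairs = [(h, m) for h, m in orthologs.items()
             if h in human_counts and m in mouse_counts]
    human = z_scale(log_cpm({h: human_counts[h] for h, _ in pairs}))
    mouse = z_scale(log_cpm({m: mouse_counts[m] for _, m in pairs}))
    # Human samples come first, mouse samples second, in each combined row.
    return {h: human[h] + mouse[m] for h, m in pairs}


# Toy data, invented purely for illustration.
human_counts = {"TP53": [10, 20, 30], "MYC": [90, 80, 70], "XIST": [5, 5, 5]}
mouse_counts = {"Trp53": [100, 300], "Myc": [900, 700]}
orthologs = {"TP53": "Trp53", "MYC": "Myc", "XIST": "Xist"}

combined = combine(human_counts, mouse_counts, orthologs)
```

The combined matrix can then go into whatever clustering is used for step 5. Note that Z-scaling within each species forces every gene to mean 0 in both datasets, so any genuine overall expression difference between human and mouse is removed by construction.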

An addition: after step 3, modelling differences between both datasets and then adjusting for these may be more appropriate than Z-scaling. However, this may inadvertently wipe out whatever effect you may be modelling between mouse and human.
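As a very small illustration of that alternative, the sketch below fits, for a single gene, the simplest possible model (an intercept plus a species indicator) and subtracts the fitted species term from the mouse samples. The function name and the toy values are hypothetical, and a real analysis would probably use a dedicated batch-correction tool with additional covariates; this only shows the idea and the caveat.

```python
from statistics import mean


def remove_species_effect(values, is_mouse):
    """Adjust one gene's log-expression values for a species offset.

    With only an intercept and a species indicator in the model, the
    least-squares species coefficient equals mean(mouse) - mean(human).
    """
    human = [v for v, m in zip(values, is_mouse) if not m]
    mouse = [v for v, m in zip(values, is_mouse) if m]
    species_coef = mean(mouse) - mean(human)
    # Subtract the fitted species term from the mouse samples only.
    return [v - species_coef if m else v for v, m in zip(values, is_mouse)]


# Toy log-expression values for one gene: three human, two mouse samples.
values = [1.0, 2.0, 3.0, 6.0, 8.0]
is_mouse = [False, False, False, True, True]

adjusted = remove_species_effect(values, is_mouse)
```

After the adjustment the mouse and human means for the gene are identical, which is exactly how a real mouse-versus-human difference can get wiped out. Unlike Z-scaling, the spread of the values within each species is left unchanged.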

