Question

Best approach to compare human and mouse gene expression quantified by RNA-seq

7.9 years ago

samuelrivero ▴ 50

Hi,

I know this is a very general question. I would like to compare RNA-seq expression data from human and mouse. I have a large cohort of RNA-seq data from human tumors (different subgroups), plus RNA-seq data from a mouse tumor model, and I would like to test whether the tumors generated in mice cluster with any of the human subgroups.

My question is how to normalize the data between the human and mouse RNA-seq.

I have mapped the reads to their respective genomes, quantified gene expression, and merged the expression data for orthologous genes between human and mouse. When I cluster the data as is, the human and mouse samples do not cluster closely together, so I think some normalization should be applied.

I have looked through the literature, but the approaches vary widely (from no normalization at all to several steps of normalization).

I am looking for any suggestion of a standard way to do this.

Thank you in advance.

RNA-Seq orthologous gene expression • 3.2k views

6.9 years ago by samuelrivero ▴ 50

If I were asked to do it, my approach would be:

1. Count read abundances over comparable versions of the GENCODE annotation for GRCh and GRCm.
2. Filter the raw counts for orthologous genes.
3. Normalise the data and log transform.
4. Convert the logged data to the Z scale.
5. Compare.
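
The filtering, normalisation, log-transform and Z-scaling steps above could be sketched roughly as follows. This is only an illustration under several assumptions that are not in the post: counts-per-million is used as the normalisation, Z-scores are computed per gene within each species separately, the ortholog table is a hypothetical one-to-one map, and the counts are toy values. It uses only the Python standard library to stay self-contained; a real analysis would normally use a dedicated RNA-seq package.

```python
import math
from statistics import mean, pstdev

def filter_orthologs(samples, keep, rename=None):
    """Keep only genes in `keep`; optionally rename them (e.g. mouse -> human symbol)."""
    rename = rename or {}
    return {s: {rename.get(g, g): c for g, c in counts.items() if g in keep}
            for s, counts in samples.items()}

def log_cpm(samples, pseudocount=1.0):
    """Counts per million within each sample, then log2(CPM + pseudocount)."""
    out = {}
    for s, counts in samples.items():
        total = sum(counts.values())
        out[s] = {g: math.log2(c * 1e6 / total + pseudocount) for g, c in counts.items()}
    return out

def zscore_per_gene(expr):
    """Z-scale each gene across the samples of one dataset."""
    genes = list(next(iter(expr.values())))
    out = {s: {} for s in expr}
    for g in genes:
        vals = [expr[s][g] for s in expr]
        mu, sd = mean(vals), pstdev(vals)
        for s in expr:
            # A gene with no variance carries no information; set it to 0.
            out[s][g] = (expr[s][g] - mu) / sd if sd > 0 else 0.0
    return out

# Toy raw counts (made up for illustration only).
human = {
    "H1": {"TP53": 100, "MYC": 300, "GAPDH": 600, "XIST": 50},
    "H2": {"TP53": 250, "MYC": 150, "GAPDH": 700, "XIST": 10},
    "H3": {"TP53": 80, "MYC": 500, "GAPDH": 550, "XIST": 0},
}
mouse = {
    "M1": {"Trp53": 20, "Myc": 90, "Gapdh": 90, "Gm1234": 5},
    "M2": {"Trp53": 60, "Myc": 30, "Gapdh": 110, "Gm1234": 7},
}
# Hypothetical one-to-one ortholog table: human symbol -> mouse symbol.
orthologs = {"TP53": "Trp53", "MYC": "Myc", "GAPDH": "Gapdh"}

human_f = filter_orthologs(human, keep=set(orthologs))
mouse_f = filter_orthologs(mouse, keep=set(orthologs.values()),
                           rename={m: h for h, m in orthologs.items()})

human_z = zscore_per_gene(log_cpm(human_f))
mouse_z = zscore_per_gene(log_cpm(mouse_f))

# One table on a shared gene namespace, ready for clustering or correlation.
combined = {**human_z, **mouse_z}
```

Scaling within each species puts every gene on the same relative scale in both datasets, but it also assumes the two cohorts cover a comparable range of biology, which may not hold for a single mouse model against several human subgroups.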

You don't have my permission to state that this approach is in any way valid, though.

An addition: after step 3, modelling differences between both datasets and then adjusting for these may be more appropriate than Z-scaling. However, this may inadvertently wipe out whatever effect you may be modelling between mouse and human.
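
One very simple reading of "modelling differences between both datasets and then adjusting for these" is to treat species as the only covariate and remove its per-gene mean effect from the logged data. The sketch below does exactly that and nothing more; the function name, the toy values and the choice of a mean-only model are assumptions made for illustration, not the method proposed in the reply.

```python
from statistics import mean

def remove_species_effect(expr, species):
    """Per gene, subtract the species mean and add back the overall mean.

    expr:    {sample: {gene: log-expression}}
    species: {sample: species label}
    Equivalent to fitting a per-gene model with species as the only
    covariate and removing the fitted species term.
    """
    genes = list(next(iter(expr.values())))
    labels = set(species.values())
    adjusted = {s: {} for s in expr}
    for g in genes:
        grand = mean(expr[s][g] for s in expr)
        group_mean = {k: mean(expr[s][g] for s in expr if species[s] == k)
                      for k in labels}
        for s in expr:
            adjusted[s][g] = expr[s][g] - group_mean[species[s]] + grand
    return adjusted

# Toy log-expression values (made up for illustration only).
expr = {
    "H1": {"A": 5.0, "B": 2.0},
    "H2": {"A": 7.0, "B": 3.0},
    "M1": {"A": 9.0, "B": 1.0},
    "M2": {"A": 10.0, "B": 1.5},
}
species = {"H1": "human", "H2": "human", "M1": "mouse", "M2": "mouse"}

adjusted = remove_species_effect(expr, species)
```

After this adjustment the human and mouse means are identical for every gene, while differences between samples of the same species are untouched. That is also the caveat from the reply in concrete form: any real biological difference that is confounded with species is removed along with the technical one.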

6.8 years ago by Kevin Blighe 89k