Question: How to identify human samples with a specific mouse gene expression signature?
5.9 years ago, Avro (140) wrote:

Hi everyone,

I have a mouse model with a specific pathology (7 mice total). All 7 mice have gene expression data (Agilent). I would like to identify the human patients who most closely resemble this mouse model in terms of gene expression. I have several human samples with gene expression data (Illumina). I have two questions:

1) How can I generate a "general gene signature" for the murine model with 7 mice? 

2) How can I test each human sample against this mouse signature? Importantly, how can I establish a threshold? Moreover, would the different platforms cause problems?

I know that there must be some good tools out there (e.g. Bioconductor, Broad Institute...), but I am new to gene expression analysis. Any suggestions would be greatly appreciated.

Thank you.


modified 5.9 years ago by Devon Ryan (96k) • written 5.9 years ago by Avro (140)

Did you at least run mice without the pathology on the same platform?

written 5.9 years ago by Devon Ryan (96k)


Thank you for your reply. I am not sure whether this answers your question: my lab used Agilent for all mice (regardless of pathology). The human data come from another lab.

written 5.9 years ago by Avro (140)

Right, but do you literally only have data for the 7 pathological mice? Normally you would generate data for pathological and non-pathological mice so you can at least see how the pathological state might be changing things (otherwise, the generated "signature" is pretty useless). While you can probably download a similar non-pathological sample from GEO, you'd then have a batch effect to deal with.
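To illustrate why the controls matter, here is a minimal sketch of deriving an up/down signature from a pathological vs. non-pathological contrast. It is an assumption-laden toy, not a recommended pipeline: it uses a plain Welch t-statistic per gene (a dedicated microarray package such as limma would be the usual choice), it assumes the values are already normalized log2 intensities, and the function names, gene names, and numbers are all made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(case, control):
    """Welch's t-statistic for one gene (case vs. control log2 values)."""
    se = sqrt(variance(case) / len(case) + variance(control) / len(control))
    return (mean(case) - mean(control)) / se if se > 0 else 0.0


def build_signature(case_expr, control_expr, n_top=2):
    """Return (up_genes, down_genes) ranked by t-statistic.

    case_expr / control_expr: dict mapping gene -> list of log2 values,
    one value per mouse (hypothetical input layout).
    """
    stats = {g: welch_t(case_expr[g], control_expr[g])
             for g in case_expr if g in control_expr}
    ranked = sorted(stats, key=stats.get, reverse=True)
    up = [g for g in ranked[:n_top] if stats[g] > 0]
    down = [g for g in ranked[::-1][:n_top] if stats[g] < 0]
    return up, down


if __name__ == "__main__":
    # Made-up toy values, three mice per group, purely for illustration.
    pathological = {"GeneA": [9.1, 9.3, 8.9], "GeneB": [5.0, 5.1, 4.9],
                    "GeneC": [3.0, 3.2, 2.9]}
    control = {"GeneA": [6.0, 6.2, 5.9], "GeneB": [5.0, 5.2, 4.9],
               "GeneC": [6.1, 5.9, 6.0]}
    print(build_signature(pathological, control, n_top=1))
```

Without the control group there is nothing to contrast against, so no gene can be called "up" or "down" in the pathology.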

written 5.9 years ago by Devon Ryan (96k)


Sorry for the confusion. Yes, we do have gene expression for non-pathological mice (we have 4 mice). Thank you!

written 5.9 years ago by Avro (140)
5.9 years ago, Devon Ryan (96k), Freiburg, Germany, wrote:

Probably the simplest method is to use GSEA with MSigDB to define gene sets that are enriched in your pathological vs. control samples. You can then survey a diverse set of likely affected and unaffected patients and look for similar patterns (look at the distribution of enrichment scores to set a threshold). With every new platform you'll be starting from scratch, so once you have some patient samples on one platform that you think are good examples of pathological/control, save them and run them on any new platforms.
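As a rough illustration of scoring each patient sample against a signature, here is a minimal sketch. Note that this is not GSEA: it is a much simpler stand-in that z-scores each gene across the human samples and takes the mean of the "up" genes minus the mean of the "down" genes. It assumes the mouse genes have already been mapped to human orthologs, and the function names, gene names, and values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_genes(expr):
    """Standardize each gene across samples.

    expr: dict mapping gene -> {sample: log2 expression value}.
    """
    z = {}
    for gene, by_sample in expr.items():
        values = list(by_sample.values())
        mu, sd = mean(values), pstdev(values)
        z[gene] = {s: (v - mu) / sd if sd > 0 else 0.0
                   for s, v in by_sample.items()}
    return z


def signature_scores(expr, up_genes, down_genes):
    """Per-sample score: mean z of up genes minus mean z of down genes."""
    z = zscore_genes(expr)
    samples = list(next(iter(expr.values())).keys())
    scores = {}
    for s in samples:
        up = [z[g][s] for g in up_genes if g in z]
        down = [z[g][s] for g in down_genes if g in z]
        scores[s] = (mean(up) if up else 0.0) - (mean(down) if down else 0.0)
    return scores


if __name__ == "__main__":
    # Made-up toy data: four patients, two signature genes.
    human = {
        "GENE1": {"P1": 10.0, "P2": 10.0, "P3": 4.0, "P4": 4.0},
        "GENE2": {"P1": 2.0, "P2": 2.0, "P3": 8.0, "P4": 8.0},
    }
    print(signature_scores(human, up_genes=["GENE1"], down_genes=["GENE2"]))
```

Because each gene is standardized within the human cohort only, the scores never compare Agilent intensities with Illumina intensities directly; they are relative to the other patients on the same platform, which is also why a new platform means recomputing everything.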

written 5.9 years ago by Devon Ryan (96k)

Thank you. As for the distribution of enrichment scores, is it a normal distribution? How would I set a threshold? Is there introductory documentation that I could read?

Thank you once again. 

written 5.9 years ago by Avro (140)

Ideally the distribution will be bimodal. If you do end up observing a normal distribution then either (A) none of the patient samples are pathological, (B) the mouse signature doesn't match what you should see in patients (it can happen), or (C) the pathological state represents the extreme of normal variance. Case (C) will be difficult to deal with since you'll need especially compelling clinical data to get anyone to believe that the top whatever percent are actually pathological.
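For the bimodal case, one simple way to place a threshold can be sketched as follows. This is only an illustration under assumptions: it uses an Otsu-style split (the cut that maximizes between-group variance), which is one of several options alongside mixture models or inspecting a histogram by eye, and the function name and scores are made up.

```python
from statistics import mean


def two_group_threshold(scores):
    """Split 1-D scores into two groups by maximizing between-group variance.

    Returns (threshold, separation), where separation is the fraction of the
    total sum of squares explained by the split (between 0 and 1).
    """
    xs = sorted(scores)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two scores")
    grand_mean = mean(xs)
    total_ss = sum((x - grand_mean) ** 2 for x in xs)
    best_cut, best_between = None, -1.0
    for i in range(1, n):
        low, high = xs[:i], xs[i:]
        # Between-group sum of squares for this candidate split.
        between = len(low) * len(high) / n * (mean(high) - mean(low)) ** 2
        if between > best_between:
            best_between = between
            best_cut = (xs[i - 1] + xs[i]) / 2  # midpoint between the groups
    separation = best_between / total_ss if total_ss > 0 else 0.0
    return best_cut, separation


if __name__ == "__main__":
    # Made-up scores with two obvious clusters.
    toy_scores = [-2.1, -1.9, -2.0, 1.8, 2.2, 2.0]
    print(two_group_threshold(toy_scores))
```

A separation close to 1 suggests two well-separated groups. A single bell-shaped distribution still gives a fairly high value (roughly 0.6 for a normal distribution), so the number should be read alongside a histogram rather than on its own, and in case (C) a cut like this carries no biological meaning without supporting clinical data.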

written 5.9 years ago by Devon Ryan (96k)