Question

Correlate phenotype and gene expression data

6.8 years ago

S_B_P • 0

I have experimental quantitative data for an immune phenotype (a continuous variable) in 7 inbred mouse strains. From the Immunological Genome Project (immgen.org), I have obtained raw immune cell gene expression data, from which I can calculate gene expression differences between any pair of mouse strains. From the 7 strains, I generate C(7,2) = 21 possible pairs. For each of the 21 pairs, I calculate the difference in phenotype (expressed as an absolute difference or a ratio). For the same 21 pairs, I calculate the gene expression differences. How can I correlate the 21 'phenotype' measures with the 21 'gene-expression' measures to come up with a list of genes potentially associated with the phenotype? Is there a statistical method or gene set enrichment tool that can pick up quantitative differences across groups of gene sets like this?

gene expression RNA-Seq phenotype R microarray • 2.6k views

ADD COMMENT • link updated 6.8 years ago by Jean-Karim Heriche 27k • written 6.8 years ago by S_B_P • 0

score 3 · Accepted Answer · 2017-07-11

6.8 years ago

Jean-Karim Heriche 27k

I don't see where you're going with computing all these pairs.
Identifying the genes contributing to the phenotype can be seen as a regression problem: you want to regress your phenotype Y on a linear combination of the gene expression levels X_i. See for example this paper. For this, you would use the strains as individuals, i.e. one phenotype value and one expression profile per strain, rather than the 21 pairs.
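
To make the idea concrete, here is a minimal sketch in plain Python. It is a simplification of what is described above: instead of fitting all genes jointly, it regresses the phenotype on one gene at a time and ranks genes by the strength of the fit. With only 7 strains and thousands of genes, a joint fit would need some form of regularization (e.g. lasso), which is not shown here. The gene names, the numbers and the helper names `fit_gene` and `rank_genes` are all made up for illustration.

```python
from math import sqrt
from statistics import mean


def fit_gene(levels, phenotype):
    """Simple linear regression of phenotype (Y) on one gene's levels (X).

    Returns (slope, pearson_r). Each list holds one value per strain,
    in the same strain order.
    """
    mx, my = mean(levels), mean(phenotype)
    sxy = sum((x - mx) * (y - my) for x, y in zip(levels, phenotype))
    sxx = sum((x - mx) ** 2 for x in levels)
    syy = sum((y - my) ** 2 for y in phenotype)
    if sxx == 0 or syy == 0:
        # A gene (or phenotype) with no variation carries no information.
        return 0.0, 0.0
    return sxy / sxx, sxy / sqrt(sxx * syy)


def rank_genes(phenotype, expression):
    """Rank genes by |Pearson r| between expression and phenotype."""
    results = []
    for gene, levels in expression.items():
        slope, r = fit_gene(levels, phenotype)
        results.append((gene, slope, r))
    results.sort(key=lambda item: abs(item[2]), reverse=True)
    return results


# Toy data (invented): one value per strain, 7 strains, same order everywhere.
phenotype = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
expression = {
    "GeneA": [3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0],  # tracks the phenotype
    "GeneB": [3.0, 3.1, 2.9, 3.0, 3.1, 2.9, 3.0],     # essentially flat
    "GeneC": [7.0, 6.5, 4.0, 5.0, 2.5, 3.0, 1.0],     # decreases with phenotype
}

ranked = rank_genes(phenotype, expression)
for gene, slope, r in ranked:
    print(f"{gene}: slope={slope:.3f}, r={r:.3f}")
```

With 7 data points per gene and many genes tested, any such ranking should be treated as a list of candidates only, and the p-values would need multiple-testing correction before drawing conclusions.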
