Question: Correlation between gene expression and methylation
tujuchuanli wrote (2.3 years ago):

I have a list of genes and want to test whether the expression levels of the genes in this list correlate with their DNA methylation levels. I want to test this hypothesis in the TCGA breast cancer data. My plan is below.

Plan A:

  1. Extract the expression and methylation level for each gene in my list. Expression is defined as RPKM from the RNA-seq data, and methylation level is taken from probes in the promoter region of the gene (from -3 kb to +500 bp relative to the TSS). If there are multiple probes in this region, I would average the probe values to get one methylation value per gene.
  2. Calculate the correlation between the two measurements (e.g. the Pearson correlation coefficient). If the p-value is significant, I can say that there is a significant correlation between expression and methylation.
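
The two steps of Plan A can be sketched roughly as follows. This is only a minimal illustration in plain Python: the probe positions, beta values and expression values are made-up toy data, the function names are hypothetical, and the window assumes a plus-strand gene.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def promoter_methylation(probes, tss, upstream=3000, downstream=500):
    """Average beta values per sample over probes in the promoter window.

    `probes` maps probe position -> list of beta values (one per sample).
    Assumes a plus-strand gene; the window would be mirrored for minus-strand genes.
    """
    in_window = [betas for pos, betas in probes.items()
                 if tss - upstream <= pos <= tss + downstream]
    if not in_window:
        raise ValueError("no probes in the promoter window")
    # zip(*...) regroups the values by sample so each sample gets one average
    return [sum(col) / len(col) for col in zip(*in_window)]

# Hypothetical toy data for one gene across five samples
tss = 10_000
probes = {
    7_500: [0.80, 0.60, 0.40, 0.30, 0.10],
    9_900: [0.70, 0.50, 0.50, 0.20, 0.20],
    20_000: [0.10, 0.90, 0.10, 0.90, 0.10],  # outside the window, ignored
}
expression = [1.0, 2.5, 3.0, 5.0, 6.5]  # e.g. RPKM

meth = promoter_methylation(probes, tss)
r = pearson_r(meth, expression)
print(round(r, 3))
```

The sketch only returns the coefficient. If SciPy is available, `scipy.stats.pearsonr` returns both the coefficient and a p-value.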

Plan B:

  1. Calculate the z-score of gene expression for each gene (z-score = (value - mean of normals) / SD of normals).
  2. Calculate the z-score of the methylation level for each gene (z-score = (value - mean of normals) / SD of normals), using probes from -3 kb to +500 bp relative to the TSS. If there are multiple probes in this region, I would average the probe values first and then calculate the z-score.
  3. Calculate the correlation coefficient as mentioned above.
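
The z-score in Plan B could look something like this. It is a sketch under assumptions: the values are invented, the function name is hypothetical, and the sample standard deviation is used because the post does not say which SD is meant.

```python
import statistics

def z_vs_normal(tumor_values, normal_values):
    """z = (value - mean of normals) / SD of normals, for each tumor value."""
    mu = statistics.mean(normal_values)
    sd = statistics.stdev(normal_values)  # sample SD (n - 1); an assumption
    return [(v - mu) / sd for v in tumor_values]

# Hypothetical expression values for one gene
normal_expr = [4.0, 5.0, 6.0]
tumor_expr = [5.0, 7.0, 9.0, 3.0]

z_expr = z_vs_normal(tumor_expr, normal_expr)
print(z_expr)
```

The same function would be applied to the averaged promoter methylation values, using the methylation of the normal samples as the reference.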

Which would be better? If you have suggestions, please tell me.

Thanks

modified 2.3 years ago by pbpanigrahi • written 2.3 years ago by tujuchuanli

Do you really need this on a global level or would per-gene comparisons work? That'd be much more meaningful.

written 2.3 years ago by Devon Ryan

Hi Devon,

I am working on an exploration similar to tujuchuanli's. Would you mind explaining what you meant by per-gene comparisons, and what that would look like?

written 23 months ago by Will

A coherent relationship between methylation and gene expression is more likely to show up when you look at individual genes than when you look globally. The relationship (think slope) won't be the same between genes, so in a global comparison you'll probably be left with a big blob of dots and no way to fit anything coherent.
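
To make that concrete, here is a small toy sketch (all numbers invented, gene names hypothetical) where two genes each show a perfect negative relationship across samples, but pooling their points gives only a weak correlation because the slopes and offsets differ.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-gene data: (methylation, expression) across four samples
genes = {
    "geneA": ([0.1, 0.2, 0.3, 0.4], [8.0, 6.0, 4.0, 2.0]),
    "geneB": ([0.6, 0.7, 0.8, 0.9], [9.0, 8.0, 7.0, 6.0]),
}

# Per-gene comparison: one correlation per gene, across samples
per_gene = {name: pearson_r(meth, expr) for name, (meth, expr) in genes.items()}

# Global comparison: all genes' points pooled into one scatter
pooled_meth = [m for meth, _ in genes.values() for m in meth]
pooled_expr = [e for _, expr in genes.values() for e in expr]
r_pooled = pearson_r(pooled_meth, pooled_expr)

for name, r in per_gene.items():
    print(name, round(r, 3))
print("pooled", round(r_pooled, 3))
```

With real data one would loop over every gene in the list and then look at the distribution of per-gene coefficients, with multiple-testing correction on the p-values.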

written 23 months ago by Devon Ryan

Standardizing the data (i.e. z-score transformation) is a linear transformation, and Pearson's correlation is unaffected by linear transformations of the variables, so you'll get the same result whether you use the raw data or the standardized data.
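
A quick way to convince yourself is a check like the one below, on made-up numbers with hypothetical function names. It z-scores one gene's tumor values against normal samples and compares the Pearson coefficient before and after. Note that this holds when one and the same transformation is applied to all values of a variable, as in a per-gene comparison across samples. If genes are z-scored separately and then pooled, the pooled vector has not undergone a single linear transformation, so the pooled result can change.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def zscore(values, reference):
    """z = (value - mean of reference) / SD of reference."""
    mu = statistics.mean(reference)
    sd = statistics.stdev(reference)
    return [(v - mu) / sd for v in values]

# Hypothetical values for a single gene
meth_tumor = [0.75, 0.55, 0.45, 0.25, 0.15]
expr_tumor = [1.0, 2.5, 3.0, 5.0, 6.5]
meth_normal = [0.30, 0.35, 0.40]
expr_normal = [4.0, 5.0, 6.0]

r_raw = pearson_r(meth_tumor, expr_tumor)
r_z = pearson_r(zscore(meth_tumor, meth_normal), zscore(expr_tumor, expr_normal))

# The two coefficients agree up to floating-point rounding
print(math.isclose(r_raw, r_z, rel_tol=1e-9))
```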

written 2.3 years ago by Jean-Karim Heriche

Yes, I need this. What I am talking about is that the expression level of the genes in my list could be controlled by the DNA methylation level. This is the only way I know of (I know it from reading papers, where it is shown as a scatter plot). If you know a better way, please tell me. Thanks

written 2.3 years ago by tujuchuanli
pbpanigrahi wrote (2.3 years ago):

As stated by Jean, Pearson correlation is independent of scale. Whether or not you normalize the data or apply any other linear rescaling, the correlation will be the same.

One suggestion I can give you: instead of simply averaging the intensities of all methylation probes for a given gene (i.e. one gene, one methylation level), you can cluster the probes based on distance and intensity values (the MethylMix package has a probe clustering function, ClusterProbes), so that you can have more than one cluster of probes per gene. That way, if you don't find a correlation with respect to one cluster, you may still see one with respect to another cluster.
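
As a rough illustration of the idea only, and not of what ClusterProbes actually does, the sketch below groups probes purely by genomic distance (a new cluster starts whenever the gap to the previous probe exceeds a threshold) and then correlates each cluster's average with expression. The threshold, probe positions, values and function names are all assumptions made up for this example, and intensity values are not used for the grouping here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cluster_by_gap(positions, max_gap=500):
    """Group sorted positions; start a new cluster when the gap exceeds max_gap."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters

# Hypothetical probes for one gene: position -> beta values across five samples
probes = {
    7_000: [0.45, 0.65, 0.45, 0.65, 0.45],
    7_200: [0.55, 0.55, 0.55, 0.55, 0.55],
    9_800: [0.80, 0.60, 0.40, 0.30, 0.10],
    9_950: [0.70, 0.50, 0.50, 0.20, 0.20],
}
expression = [1.0, 2.5, 3.0, 5.0, 6.5]

clusters = cluster_by_gap(probes)
cluster_r = {}
for cluster in clusters:
    # Average the probes of this cluster per sample, then correlate with expression
    mean_beta = [sum(col) / len(col) for col in zip(*(probes[p] for p in cluster))]
    cluster_r[tuple(cluster)] = pearson_r(mean_beta, expression)

for cluster, r in cluster_r.items():
    print(cluster, round(r, 3))
```

In this toy case the distal cluster shows almost no correlation while the cluster near the TSS is strongly anti-correlated, which a single gene-wide average would partly hide.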

Explore this thread

written 2.3 years ago by pbpanigrahi