Question: Correlation between gene expression and methylation
tujuchuanli wrote:

I have a list of genes and want to test whether the expression level of the genes in this list correlates with DNA methylation level. I am testing this hypothesis in TCGA breast cancer data. Below is my plan.

Plan A:

  1. Extract the expression and methylation level for each gene in my list. Expression is defined as RPKM from RNA-seq data, and methylation level is taken from probes in the promoter region of the gene (from -3 kb to +500 bp relative to the TSS). If there are multiple probes in this region, I would average the probe values to get a single methylation value for the gene.
  2. Calculate the correlation between the two data sets (e.g. the Pearson correlation coefficient). If the p-value is significant, I can say that there is a significant correlation between expression and methylation.
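
A minimal sketch of Plan A for a single gene, using only the Python standard library. The expression and probe values are made-up toy numbers, and the helper names (`pearson`, `gene_methylation`, `permutation_pvalue`) are hypothetical. The p-value here comes from a permutation test, which is swapped in for the usual t-test-based p-value so that the example needs no external packages.

```python
import math
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def gene_methylation(probe_betas):
    """Average all promoter probes per sample.

    probe_betas: one list per probe, each holding one beta value per sample.
    """
    return [mean(sample_values) for sample_values in zip(*probe_betas)]

def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the Pearson correlation."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Toy data (assumed): one gene, six samples, two promoter probes.
expression = [12.1, 9.8, 7.5, 6.9, 4.2, 3.1]  # RPKM per sample
promoter_probes = [
    [0.10, 0.22, 0.35, 0.41, 0.63, 0.80],
    [0.14, 0.18, 0.39, 0.45, 0.59, 0.76],
]

methylation = gene_methylation(promoter_probes)
r = pearson(expression, methylation)
p = permutation_pvalue(expression, methylation)
print(f"r = {r:.3f}, permutation p = {p:.4f}")
```

With real TCGA data the same steps would run once per gene, across all samples that have both RNA-seq and methylation measurements.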

Plan B:

  1. Calculate a z-score of gene expression for each gene, defined as (value - mean of normal samples) / SD of normal samples.
  2. Calculate a z-score of methylation level for each gene, defined the same way, using probes from -3 kb to +500 bp relative to the TSS. If there are multiple probes in this region, I would average the probe values first and then calculate the z-score.
  3. Calculate the correlation coefficient as described in Plan A.

Which plan is better? If you have suggestions, please tell me.

Thanks

written 13 months ago by tujuchuanli, modified 13 months ago by pbpanigrahi

Do you really need this on a global level or would per-gene comparisons work? That'd be much more meaningful.

written 13 months ago by Devon Ryan

Hi Devon,

I am looking at a similar exploration to tujuchuanli's. Would you mind explaining what you meant by per-gene comparisons, and what that would look like?

written 9 months ago by Will

It's more likely that you'll see a coherent relationship between methylation and gene expression if you look at individual genes rather than globally. The relationship (think slope) won't be the same between genes, so if you pool them you'll probably be left with a big blob of dots and no way to coherently fit things.
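
A small simulation can illustrate the point. This is only a sketch with made-up numbers: the three gene names, their baselines and slopes, and the noise level are all assumptions chosen for illustration, not estimates from any real data.

```python
import math
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)

# Hypothetical genes: (baseline expression, slope of expression vs methylation).
genes = {"geneA": (10.0, -8.0), "geneB": (2.0, -1.0), "geneC": (6.0, 5.0)}

per_gene_r = {}
pooled_meth, pooled_expr = [], []
for name, (baseline, slope) in genes.items():
    # 50 simulated samples per gene: methylation in [0, 1), noisy linear expression.
    meth = [rng.random() for _ in range(50)]
    expr = [baseline + slope * m + rng.gauss(0, 0.2) for m in meth]
    per_gene_r[name] = pearson(meth, expr)
    pooled_meth += meth
    pooled_expr += expr

# Global comparison: all genes thrown into one scatter plot.
pooled_r = pearson(pooled_meth, pooled_expr)

for name, r in per_gene_r.items():
    print(f"{name}: r = {r:.2f}")
print(f"pooled: r = {pooled_r:.2f}")
```

Each gene shows a clear relationship on its own, while the pooled correlation is much weaker because the slopes and baselines differ between genes.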

written 9 months ago by Devon Ryan

Standardizing the data (i.e. z-score transformation) is a linear transformation, and Pearson's correlation is unaffected by linear transformations of the variables (with positive scaling), so you'll get the same result whether you use the raw data or the standardized data.
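
A quick check of this with toy numbers (not from the original reply). The tumor and normal values below are invented, and `zscore_vs_normal` is a hypothetical helper that implements the Plan B formula, (value - mean of normal) / SD of normal.

```python
import math
from statistics import mean, stdev

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def zscore_vs_normal(values, normal_values):
    """Standardize values against the mean and SD of the normal samples."""
    mu, sd = mean(normal_values), stdev(normal_values)
    return [(v - mu) / sd for v in values]

# Toy data (assumed) for one gene.
tumor_expr = [5.2, 7.9, 3.1, 9.4, 6.0, 4.3]
tumor_meth = [0.61, 0.35, 0.80, 0.22, 0.48, 0.70]
normal_expr = [8.1, 8.8, 7.5, 9.0]
normal_meth = [0.20, 0.25, 0.18, 0.30]

r_raw = pearson(tumor_expr, tumor_meth)
r_z = pearson(
    zscore_vs_normal(tumor_expr, normal_expr),
    zscore_vs_normal(tumor_meth, normal_meth),
)
print(f"raw r = {r_raw:.6f}, z-score r = {r_z:.6f}")
```

The two values agree up to floating-point error. Note that this holds when the correlation is computed for one gene across samples. If several genes are each standardized with their own mean and SD and then pooled into a single correlation, the pooled value can differ from the one computed on pooled raw data.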

written 13 months ago by Jean-Karim Heriche

Yes, I need this. What I am talking about is that the expression level of the genes in my list could be controlled by DNA methylation level. This is the only way I know of (I learned it from reading papers; it can be viewed in a scatter plot). If you know a better way, please tell me. Thanks

written 13 months ago by tujuchuanli
pbpanigrahi wrote:

As stated by Jean, correlation is independent of scale. Whether you normalize the data or apply any other linear rescaling, the Pearson correlation will be the same.

One suggestion: instead of simply averaging the intensities of all methylation probes for a given gene (i.e. one gene, one methylation level), you can cluster the probes based on distance and intensity values (the MethylMix package has a probe clustering function, ClusterProbes), so that you can have more than one cluster of probes per gene. Then, if you don't find a correlation with respect to one cluster, you may still see a correlation with respect to another cluster.
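
A toy illustration of why this can help. The sketch assumes the probe clusters have already been assigned (for example by a tool such as MethylMix); it does not reproduce ClusterProbes, and the cluster names and all values are invented.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_probes(probes):
    """Average a list of probes (one list of per-sample values each) per sample."""
    return [mean(sample_values) for sample_values in zip(*probes)]

# Toy data (assumed): one gene, six samples.
expression = [8.0, 7.1, 6.3, 5.2, 4.4, 3.0]

# Hypothetical, pre-assigned probe clusters for this gene.
clusters = {
    "upstream": [
        [0.20, 0.30, 0.40, 0.50, 0.60, 0.80],
        [0.24, 0.26, 0.44, 0.46, 0.64, 0.76],
    ],
    "downstream": [
        [0.90, 0.10, 0.80, 0.20, 0.70, 0.15],
        [0.86, 0.14, 0.84, 0.16, 0.74, 0.11],
    ],
}

# One correlation per cluster.
r_by_cluster = {
    name: pearson(expression, average_probes(probes))
    for name, probes in clusters.items()
}

# One correlation for the plain average over all probes.
all_probes = [probe for probes in clusters.values() for probe in probes]
r_all = pearson(expression, average_probes(all_probes))

for name, r in r_by_cluster.items():
    print(f"{name}: r = {r:.2f}")
print(f"all probes averaged: r = {r_all:.2f}")
```

In this example the "upstream" cluster tracks expression closely, but averaging it with the unrelated "downstream" probes hides most of that signal.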

Explore this thread

written 13 months ago by pbpanigrahi