Question: Plotting TCGA's HM450 data
vinayjrao (JNCASR, India) wrote 5 weeks ago:

Hello,

I want to look at the methylation pattern of the promoter region of a certain gene in different cancers, using TCGA data obtained from cBioPortal.

I want to know whether the data (beta values) can be plotted directly, or whether I have to normalize them in some way first.

Thanks in advance.

Kevin Blighe (Republic of Ireland) wrote 5 weeks ago:

The β (beta) values are already normalised and lie in the range 0 to 1, so you can plot them directly.

Take a look at the definition in this linked thread: A: M Value from Non-Normalized Methylated and Unmethylated Signal 450k
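
As a rough illustration of that definition, here is a minimal Python sketch of how a beta value is usually derived from the methylated and unmethylated signal intensities, and how it maps to an M-value. The function names, the offset of 100 and the clipping constant `eps` are assumptions made for this example; they are not taken from cBioPortal or from the linked thread.

```python
import math


def beta_value(meth, unmeth, offset=100.0):
    """Beta = M / (M + U + offset). The offset of 100 is an assumed default."""
    return meth / (meth + unmeth + offset)


def m_value(beta, eps=1e-6):
    """M-value = log2(beta / (1 - beta)), clipping beta away from 0 and 1."""
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


# Hypothetical intensities for a single probe in a single sample.
b = beta_value(meth=900.0, unmeth=0.0)
print(b, m_value(b))
```

Because the offset sits in the denominator, beta always stays between 0 and 1, which is why the values from different samples can be plotted on a common axis.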

Kevin


Thanks a lot, Kevin. Can I directly correlate the methylation status with expression levels?

For the same gene, I am looking at the expression level and the methylation status. Say the median RSEM value is around 13000 in breast carcinoma and 20000 in glioblastoma, and the corresponding methylation values range from 0 to 1, as you mentioned. Can these data be correlated directly (for example, would the methylation be higher in breast cancer), or do I need to process the files in some way first?

Thanks again.

Reply written 5 weeks ago by vinayjrao

Yes, you can correlate anything to anything, but a correlation will not reveal the underlying mechanism at play. I would not worry too much about the differences in scale between the expression data and the methylation data. Just ensure that the expression datasets are normalised in the same way. If using RSEM, it may help to log-transform the values so that their distribution is closer to normal.
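
To make the idea concrete, here is a minimal sketch that log-transforms RSEM values and computes a Pearson correlation coefficient against beta values. All numbers are made up for illustration, the pseudocount of 1 is an assumption, and the helper names (`log2p1`, `min_max`, `pearson`) are hypothetical.

```python
import math


def log2p1(values):
    """log2(x + 1); the pseudocount of 1 is an assumed choice to handle zeros."""
    return [math.log2(v + 1.0) for v in values]


def min_max(values):
    """Linearly rescale values to the range 0 to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Made-up paired values for one gene across five samples.
rsem = [13000.0, 20000.0, 8000.0, 15500.0, 5000.0]
beta = [0.42, 0.15, 0.61, 0.30, 0.78]

r_raw = pearson(rsem, beta)
r_scaled = pearson(min_max(rsem), beta)
r_log = pearson(log2p1(rsem), beta)
print(r_raw, r_scaled, r_log)
```

Pearson's r is unchanged by a linear rescaling such as min-max scaling to 0 to 1, so the raw and rescaled RSEM values give the same coefficient. The log transform is nonlinear and can change it.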

There are some previous threads:

Slightly related but similar idea:

Another possibility, which is preferable, is to build linear regression models between each gene's expression and the corresponding methylation probes mapping to it (one-to-many). From these models, you can easily derive both a p-value and an R-squared value, along with many other things.
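
A minimal sketch of that one-to-many approach: fit a simple least-squares line of expression on beta for each probe mapped to the gene. The probe names and all values are hypothetical, and this plain-Python version only reports slope, intercept and R-squared. For a p-value you would normally use a statistics package, for example `scipy.stats.linregress`.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("regression is undefined for constant input")
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical log2-scale expression of one gene across five samples.
expression = [13.7, 14.3, 13.0, 13.9, 12.3]

# Hypothetical beta values for two probes mapped to that gene (same samples).
probes = {
    "probe_A": [0.42, 0.15, 0.61, 0.30, 0.78],
    "probe_B": [0.50, 0.52, 0.49, 0.51, 0.50],
}

# One model per probe: expression ~ beta.
results = {name: fit_line(betas, expression) for name, betas in probes.items()}
for name, (slope, intercept, r2) in results.items():
    print(name, round(slope, 3), round(intercept, 3), round(r2, 3))
```

A negative slope with a high R-squared for a promoter probe would be consistent with methylation-associated silencing, but as with the correlation it does not establish the mechanism.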

Reply written 5 weeks ago by Kevin Blighe

Thanks a lot, Kevin. You have been extremely helpful.

Reply written 5 weeks ago by vinayjrao

I have just one more question. Is it acceptable to correlate expression and methylation by computing a Pearson correlation coefficient (PCC)? If yes, should it be done after rescaling the RSEM counts to the range 0 to 1, or with the raw values?

Reply written 5 weeks ago by vinayjrao