Question: Human 450K array data and rnaseqv2 data
4.5 years ago, arpit.singh1203 (India) wrote:

Hello all,

I planned to analyze 450K human methylation array data together with gene expression data, in order to correlate differentially expressed genes with differentially methylated regions (DMRs) in publicly available cancer data from TCGA.

I used COHCAP to find DMRs in 75 samples (65 tumor + 10 normal) [TCGA 450K human methylation data, level 1].

Next, I used edgeR to find differentially expressed genes in 68 samples (65 tumor + 3 normal; RNA-seq data was missing for 7 normal samples) [TCGA RNASeqV2 data, level 3].

As a proof of concept, I extracted 458 genes from CpG_island_filtered-Avg-by_Island.xlsx (from the COHCAP analysis) and only 36 downregulated genes from the edgeR analysis.

Only 2 genes were common to both lists. I am worried that I am missing some important steps, as I had expected at least 20-30 genes to be shared between the COHCAP analysis (450K methylation array, level 1) and the edgeR analysis (RNASeqV2, level 3).
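
One way to put the overlap in context is to ask how many shared genes chance alone would produce. The sketch below is only an illustration, not part of the COHCAP or edgeR workflow: it intersects two gene lists and computes a hypergeometric tail probability for the overlap. The gene names are made up, and the universe size of 20000 genes is a placeholder assumption that should be replaced with the number of genes actually testable on both platforms. The test also treats the two lists as independent draws, which is a simplification.

```python
from math import comb


def expected_overlap(universe, size_a, size_b):
    """Expected number of shared genes if both lists were random draws."""
    return size_a * size_b / universe


def overlap_pvalue(universe, size_a, size_b, observed):
    """Hypergeometric P(overlap >= observed) for two lists drawn from `universe` genes."""
    total = comb(universe, size_b)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Toy gene lists (hypothetical names) to show the intersection step.
dm_genes = {"GENE_A", "GENE_B", "GENE_C"}
de_genes = {"GENE_B", "GENE_D"}
common = dm_genes & de_genes

# List sizes from the post; the universe size is an ASSUMED placeholder.
ASSUMED_UNIVERSE = 20000
chance_expectation = expected_overlap(ASSUMED_UNIVERSE, 458, 36)
p_at_least_two = overlap_pvalue(ASSUMED_UNIVERSE, 458, 36, 2)
```

If the chance expectation under a realistic universe is close to the observed overlap, a small intersection by itself does not indicate a mistake in either analysis.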

Any suggestions or improvements are welcome!


Are the 65 tumor samples matched in the methylation and expression data?  If so, why not simply perform a correlation instead of trying to compare gene sets?

written 4.5 years ago by Sean Davis

Yes, the 65 tumor samples are matched by unique TCGA ID. I am afraid I don't understand what you mean by correlating rather than comparing. Could you please elaborate on that?

written 4.5 years ago by arpit.singh1203

Compute the correlation between expression and methylation for each gene directly, for example using Pearson correlation.
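
A minimal sketch of that idea in plain Python, assuming each gene has one expression value and one methylation value per matched sample, listed in the same sample order. The dictionaries `expression` and `methylation`, the gene names, and the numbers are hypothetical toy data, not TCGA values. How probe-level methylation is summarized per gene is left open here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; None if either is constant."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined for a constant vector
    return sxy / sqrt(sxx * syy)


def per_gene_correlation(expression, methylation):
    """Correlate expression with methylation for every gene present in both tables."""
    shared = sorted(set(expression) & set(methylation))
    return {gene: pearson(expression[gene], methylation[gene]) for gene in shared}


# Toy data: four matched samples per gene, same sample order in both tables.
expression = {
    "GENE_A": [8.1, 6.9, 5.2, 3.0],
    "GENE_B": [2.0, 2.1, 1.9, 2.0],
}
methylation = {
    "GENE_A": [0.10, 0.25, 0.55, 0.90],
    "GENE_B": [0.5, 0.5, 0.5, 0.5],
}
result = per_gene_correlation(expression, methylation)
```

Genes can then be ranked by correlation. A strongly negative value, as for the toy `GENE_A`, is the pattern one would look for when higher methylation goes with lower expression. This uses all 65 matched tumor samples and does not depend on the small number of normal samples.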

written 4.5 years ago by Sean Davis

I ran DESeq2 on my data and got zero DE genes. I think the following reasons could cause this:

  1. Data is taken from TCGA and only 3 samples are "Normal" while 65 are "Tumor".
  2. Not having replicates for the data.

Do you think these could be the reasons? And is that why you suggested that I "compute the correlation between expression and methylation for each gene directly using Pearson correlation"?

PS: All rows containing zeroes were removed.

written 4.5 years ago by arpit.singh1203