Question: measure the distribution bias in genomic features
5 months ago by
Hughie20 wrote:

Hi everyone,

I have recently been analyzing DNA methylation data and have run into a problem:

As we know, the distribution of DNA methylation varies across genomic features (core promoter, enhancer, CpG island, etc.). I want to measure this distribution bias among genomic features.
In other words, I want to quantify the deviation between the expected and observed number of DNA methylation sites in each feature.

I read some papers and found that various methods are used for this kind of analysis, for example the independent t-test, Chi-square test, Mann-Whitney U test, and permutation test, which left me unsure which one to choose.

I have tried the independent t-test and calculated ratio = log2(mean of observed / mean of expected) for plotting a heatmap (if the ratio > 0, I conclude that DNA methylation occurs more often than expected in this region, and vice versa). However, someone told me that the Chi-square test may be better for measuring the difference between observed and expected, so I tried that as well. The problem is that I only get one chi-square value per genomic feature, and these values vary enormously (from 300 to 40000000), which makes them difficult to visualize.
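To make the two quantities concrete, here is a minimal sketch of the log2(observed/expected) ratio and a per-feature chi-square term. The feature names and counts are made up for illustration, and the sketch uses single counts per feature rather than means over replicates, which is an assumption and not necessarily how the real data are organized.

```python
import math

# Hypothetical observed counts of methylation sites per feature (illustrative only).
observed = {"core_promoter": 120, "enhancer": 480, "CpG_island": 60}
# Hypothetical expected counts, e.g. derived from how much of the genome each feature covers.
expected = {"core_promoter": 200, "enhancer": 400, "CpG_island": 60}


def log2_ratio(obs, exp):
    """Signed enrichment: > 0 means more sites than expected, < 0 means fewer."""
    return math.log2(obs / exp)


def chi_square_term(obs, exp):
    """Pearson chi-square contribution (O - E)^2 / E for one feature; always >= 0."""
    return (obs - exp) ** 2 / exp


# Map each feature to (log2 ratio, chi-square term).
summary = {
    feature: (
        log2_ratio(observed[feature], expected[feature]),
        chi_square_term(observed[feature], expected[feature]),
    )
    for feature in observed
}

for feature, (ratio, chi2_term) in summary.items():
    print(f"{feature}: log2(O/E) = {ratio:.3f}, chi-square term = {chi2_term:.1f}")
```

The log2 ratio keeps the direction of the deviation and stays on a bounded scale, whereas the chi-square term is unsigned and grows with the counts, which is consistent with the wide range of values described above.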

So, I have several questions:

  1. Which method do you think is best for this kind of problem?
  2. If the Chi-square test is used, how should the chi-square values be handled for visualization (for example, by normalizing the chi-square value of each region to that of a random region)?
  3. I noticed that the p-values are typically tiny (often around 10e-100, and sometimes even reported as 0). I looked at some answers on how to handle statistical tests on very large datasets and found no clear conclusion. So, when you run a statistical test on a very large dataset (sample sizes on the order of 10e6 are common in bioinformatics), how do you handle the very small p-values?
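The p-value of 0 mentioned in question 3 can be reproduced as floating-point underflow. The sketch below assumes a chi-square test with 1 degree of freedom (an assumption, the real tests may have more), for which the p-value equals erfc(sqrt(x/2)). It also computes an approximate log10 p-value from the standard large-argument expansion erfc(z) ≈ exp(-z²)/(z·sqrt(π)), which stays finite when the p-value itself underflows. The function names are hypothetical.

```python
import math


def chi2_pvalue_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def approx_log10_pvalue_df1(chi2):
    """Approximate log10 of the p-value for large statistics (df = 1 only).

    Uses erfc(z) ~ exp(-z^2) / (z * sqrt(pi)), which is accurate when z is large,
    so it should not be used for small chi-square values.
    """
    z = math.sqrt(chi2 / 2.0)
    return (-z * z - math.log(z) - 0.5 * math.log(math.pi)) / math.log(10)


# The two extremes of the chi-square range mentioned in the post.
for chi2 in (300.0, 40000000.0):
    p = chi2_pvalue_df1(chi2)
    log10_p = approx_log10_pvalue_df1(chi2)
    print(f"chi2 = {chi2:.0f}: p = {p!r}, approx log10(p) = {log10_p:.2f}")
```

For the larger statistic the direct p-value comes out as exactly 0.0 in double precision, while the log-scale value remains usable for ranking or plotting.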

Thanks for your time, really appreciate any answers!
