Question: Which strategy to analyse scores at multiple DNA positions across samples?
5 weeks ago by
alorsonmethyle20 wrote:


I'm writing here today because I'm looking for a strategy to analyse my dataset.

Briefly, we have a WGBS dataset with methylation scores. Within it, we are looking at the methylation scores of 20 positions, which are low (between 0 and 5%) across more than 50 samples. I'd like to know whether it is always the same samples that are the "most" methylated.

I have a dataset with the columns: position, sample, methylation_score

I don't really know how to carry on with my analysis. This is where I stopped:

- the scores are low (but they probably have a functional role in what we are looking for)
- the distribution may not be homogeneous among samples and positions
- I thought about using a rank-based test (such as a rank correlation test like Spearman), but I'm blocked by the fact that I have two qualitative variables: position and sample. I thought about PCA, but I only have one quantitative dimension. I thought about Kruskal-Wallis, which gives me a significant p-value. Then I tried to rank all the scores and assign each one a score based on the normalised rank of its methylation score, but I'm not really sure about this approach.
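The normalised-rank idea above can be sketched as follows. This is only a minimal illustration under stated assumptions, not a validated analysis: the records are invented toy values, the function names `average_ranks` and `rank_summary` are hypothetical, every sample is assumed to have a score at every position, and ranking is done within each position (rather than over all scores at once) so that positions with different baseline levels stay comparable. Kendall's W is computed without tie correction as a rough measure of how consistently samples keep the same rank across positions (0 = no agreement, 1 = identical ordering at every position).

```python
from collections import defaultdict

# Invented toy records: (position, sample, methylation score in %).
records = [
    ("pos1", "S1", 0.5), ("pos1", "S2", 2.0), ("pos1", "S3", 4.5),
    ("pos2", "S1", 1.0), ("pos2", "S2", 1.5), ("pos2", "S3", 3.0),
    ("pos3", "S1", 0.2), ("pos3", "S2", 3.5), ("pos3", "S3", 2.5),
]


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at order[i].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in range(i, j + 1):
            ranks[order[idx]] = mean_rank
        i = j + 1
    return ranks


def rank_summary(records):
    """Rank samples within each position, then summarise per sample."""
    by_position = defaultdict(dict)
    for position, sample, score in records:
        by_position[position][sample] = score

    samples = sorted({sample for _, sample, _ in records})
    k = len(samples)       # number of samples
    n = len(by_position)   # number of positions

    # Sum of within-position ranks for each sample (assumes complete data).
    rank_sums = dict.fromkeys(samples, 0.0)
    for scores in by_position.values():
        ranks = average_ranks([scores[s] for s in samples])
        for s, r in zip(samples, ranks):
            rank_sums[s] += r

    # Mean normalised rank: 0 = always least methylated, 1 = always most.
    mean_norm_rank = {s: (rank_sums[s] / n - 1) / (k - 1) for s in samples}

    # Kendall's W (no tie correction): agreement of rankings across positions.
    expected_sum = n * (k + 1) / 2
    s_stat = sum((rs - expected_sum) ** 2 for rs in rank_sums.values())
    kendall_w = 12 * s_stat / (n ** 2 * (k ** 3 - k))
    return mean_norm_rank, kendall_w


mean_norm_rank, kendall_w = rank_summary(records)
print(mean_norm_rank)
print(kendall_w)
```

A sample whose mean normalised rank stays close to 1 is among the most methylated at most positions. The Friedman test statistic equals n(k-1)W for n positions and k samples, so W can be tested that way; with many tied scores near zero, a tie-corrected or permutation-based version would probably be safer.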

So, how would you set up a strategy to find out whether it is the same samples that tend to be "the most methylated" across positions?

I hope I'm clear enough; otherwise, please tell me how I can refine what I'd like to achieve.



