Question: Which strategy to analyse scores at multiple DNA positions across samples?
5 weeks ago, alorsonmethyle wrote:

Hi,

I'm writing here today because I'm looking for a strategy to analyse my dataset.

Briefly, we have a WGBS dataset with methylation scores. Within it, we are looking at the methylation scores of 20 positions, which are low (between 0 and 5%) across >50 samples. I'd like to know whether it is always the same samples that are the "most" methylated.

I have a dataset with three columns: position, sample, methylation_score

I don't really know how to carry the analysis forward. This is where I stopped:

- The scores are low (but probably have a functional role in what we are looking for).
- The distribution may not be homogeneous across samples and positions.
- I thought about a rank-based test (such as a Spearman rank correlation), but I'm blocked by the fact that I have two qualitative variables: position and sample.
- I thought about PCA, but I only have one quantitative dimension.
- I thought about Kruskal-Wallis, which gives me a significant p-value.
- Then I tried ranking all the scores and giving each one a score based on the normalised rank of its methylation score, but I'm not really sure about this approach.
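To make the rank-based idea concrete, here is a minimal sketch in plain Python. It is not a validated analysis, and several things in it are assumptions rather than part of my actual data: the toy `records`, the helper names `average_ranks` and `rank_consistency`, and the requirement that every sample has a score at every position. Note that it ranks the samples within each position (not all scores pooled together), then summarises each sample by its mean normalised rank and measures overall agreement between positions with Kendall's coefficient of concordance W (0 = no agreement, 1 = identical ordering at every position).

```python
from collections import defaultdict


def average_ranks(values):
    """Rank values from 1 (lowest) to n (highest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_consistency(records):
    """records: iterable of (position, sample, methylation_score) tuples.

    Assumes a complete table: every sample has a score at every position.
    Returns (mean normalised rank per sample, Kendall's W without tie correction).
    """
    by_position = defaultdict(dict)
    for position, sample, score in records:
        by_position[position][sample] = score

    samples = sorted(next(iter(by_position.values())))
    n_pos, n_samp = len(by_position), len(samples)

    # Sum, over positions, of each sample's within-position rank.
    rank_sums = dict.fromkeys(samples, 0.0)
    for scores in by_position.values():
        ranks = average_ranks([scores[s] for s in samples])
        for s, r in zip(samples, ranks):
            rank_sums[s] += r

    # Mean rank rescaled to [0, 1]; 1 means "most methylated at every position".
    mean_norm_rank = {
        s: (rank_sums[s] / n_pos - 1) / (n_samp - 1) for s in samples
    }

    # Kendall's W: spread of the rank sums relative to the maximum possible spread.
    expected_sum = n_pos * (n_samp + 1) / 2
    spread = sum((rs - expected_sum) ** 2 for rs in rank_sums.values())
    w = 12 * spread / (n_pos ** 2 * (n_samp ** 3 - n_samp))
    return mean_norm_rank, w


# Hypothetical toy data: 3 positions x 3 samples, scores in percent.
records = [
    ("pos1", "S1", 0.5), ("pos1", "S2", 2.0), ("pos1", "S3", 4.5),
    ("pos2", "S1", 0.0), ("pos2", "S2", 1.0), ("pos2", "S3", 3.0),
    ("pos3", "S1", 1.5), ("pos3", "S2", 2.5), ("pos3", "S3", 5.0),
]
mean_norm_rank, w = rank_consistency(records)
print(mean_norm_rank, w)
```

With scores this low, many values are likely to be exactly 0 and therefore tied; the W above has no tie correction, so it would tend to understate agreement in that case. If pandas and SciPy are available, `DataFrame.rank` and `scipy.stats.friedmanchisquare` cover the ranking step and the closely related Friedman test, which does provide a p-value.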

So, how would you set up a strategy to find out whether it is the same samples that tend to be among "the most methylated" at each position?

I hope I'm clear enough; otherwise, please tell me how I can refine what I'd like to achieve.

Best,

