Difference in distribution of two sets of sites/positions along genome
8.6 years ago
dmiuso • 0

Hi

I have two sets of genomic positions (sites) on the mouse genome. One has about 13,000 sites (let's call it the background set); the other has about 400 sites and is a subset of the background set. I would like to check whether the distribution (density?) of sites along the genome differs locally between the two sets (13,000 vs. 400).

I am a novice at R and at this type of bioinformatics, so I would very much appreciate advice on both the statistical test to apply and an R package that could be used. Thanks!

R genome • 1.2k views

Could you clarify what you're trying to do? If your 400 sites are a subset of 13000 sites, what do you expect to be different about them? If the 400 sites are a random sample from a population of 13000 then you don't expect any statistical difference. If the 400 sites are not a random sample from the 13000, how they were obtained/selected could tell you what's different between them and the others.


Thanks a lot for asking and trying to help, Jean-Karim!

I believe they can differ in local density along the genome. The background is as follows. We ran the Illumina HumanMethylation450 (450K) array on mouse DNA. Aligning the probes to the mouse genome showed that about 13,000 of them have 3 or fewer mismatches, so we took those to work with. Out of these 13,000 sites, about 400 turned out to be significantly hypomethylated in knockout vs. wild-type mice. I want to check whether these 400 sites have the same distribution (density) along the genome as the "background" 13,000 sites (which is itself far from uniform). Dmitry
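One possible way to frame this check, shown only as a minimal sketch and in Python rather than R: cut the genome into fixed-size windows, count background sites and hypomethylated sites per window, and ask with a one-sided hypergeometric test whether a window holds more of the 400 sites than a random draw from the 13,000 would give. Because the test conditions on the background counts, the uneven density of the background is accounted for. The function names, the `(chrom, pos)` input format, the 1 Mb window size, and the Bonferroni correction are all assumptions made for illustration; nobody in this thread proposed this method.

```python
from collections import Counter
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n sites are drawn without replacement from N sites,
    K of which fall in the window of interest."""
    total = comb(N, n)
    num = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return num / total


def window_enrichment(background, hits, window=1_000_000):
    """Per-window enrichment of `hits` relative to `background`.

    background, hits: iterables of (chrom, pos) tuples; `hits` is assumed
    to be a subset of `background`. Window size is an arbitrary choice.
    Returns (window_id, n_background, n_hits, p, p_bonferroni) tuples,
    sorted by raw p-value.
    """
    def bin_of(site):
        chrom, pos = site
        return (chrom, pos // window)

    bg_counts = Counter(bin_of(s) for s in background)
    hit_counts = Counter(bin_of(s) for s in hits)
    N = sum(bg_counts.values())   # all background sites (about 13,000)
    n = sum(hit_counts.values())  # all hit sites (about 400)
    n_tests = len(bg_counts)      # one test per window with background sites

    results = []
    for b, K in bg_counts.items():
        k = hit_counts.get(b, 0)
        p = hypergeom_upper_tail(k, K, n, N)
        results.append((b, K, k, p, min(1.0, p * n_tests)))
    results.sort(key=lambda r: r[3])
    return results
```

Windows at the top of the returned list are the candidates for a locally elevated density of hypomethylated sites. The choice of window size affects the result, and with many windows a less conservative correction than Bonferroni may be preferable.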
