Test For Spatial Randomness Of Genomic Features In A Poorly Assembled Genome
7.9 years ago
bewickaj ▴ 10

Hi all,

I am working with a very poorly assembled genome (~450,000 scaffolds, max. scaffold length ~250,000 bp). Using a scaffold map, I am able to place some genomic features I am interested in on linkage groups (LGs). I would like to test the null hypothesis that these features are randomly distributed within each LG and across the entire map, and I was thinking about using a Kolmogorov–Smirnov test. However, because the assembly and map are incomplete, I am worried about getting a false positive, i.e. concluding that the features are non-randomly distributed when the apparent clustering only reflects where sequence happens to be placed. There is always some chance of a false positive, but I am particularly worried because, visually, there are large, noticeable gaps of missing information on each LG. Questions: Have I landed on the correct test? Is there a more robust one for highly fragmented data? Is a test not even possible in my situation? Is there a clever way of testing distributions in a genomic framework?
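To make the concern concrete, here is a minimal sketch of one way the gaps might be taken out of the null: collapse each LG to its covered sequence only, then compare the feature positions with a uniform distribution on that collapsed coordinate using a KS statistic and a Monte Carlo p-value. This is an illustration under stated assumptions, not an established method. The function names and the `(start, end)` interval format are hypothetical, and it assumes the covered intervals do not overlap and that placed scaffolds are positioned correctly.

```python
import bisect
import random


def collapse_gaps(positions, intervals):
    """Map LG coordinates onto the covered sequence only (gaps removed).

    `intervals` is a hypothetical list of non-overlapping (start, end)
    pairs marking where scaffolds are placed on the linkage group.
    """
    intervals = sorted(intervals)
    starts = [s for s, _ in intervals]
    offsets, total = [], 0
    for s, e in intervals:
        offsets.append(total)  # covered length before this interval
        total += e - s
    collapsed = []
    for p in positions:
        i = bisect.bisect_right(starts, p) - 1
        if i < 0 or p > intervals[i][1]:
            raise ValueError(f"position {p} is not inside a covered interval")
        collapsed.append(offsets[i] + (p - intervals[i][0]))
    return collapsed, total


def ks_uniform(values, total):
    """One-sample KS statistic against Uniform(0, total)."""
    xs = sorted(v / total for v in values)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


def mc_pvalue(values, total, n_sim=2000, seed=0):
    """Monte Carlo p-value: how often uniform draws give a KS statistic
    at least as large as the observed one."""
    rng = random.Random(seed)
    observed = ks_uniform(values, total)
    n = len(values)
    hits = sum(
        ks_uniform([rng.uniform(0, total) for _ in range(n)], total) >= observed
        for _ in range(n_sim)
    )
    return observed, (hits + 1) / (n_sim + 1)
```

For example, `collapse_gaps(features, covered)` followed by `mc_pvalue(collapsed, total)` would be run once per LG. The point of collapsing is that a gap in the map can no longer look like a region depleted of features, although unplaced scaffolds still carry no information.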

Thanks for giving me your time!

