Hi
I am working on a simple algorithm that lets an aligner handle structural variations (SVs). As a first step I need to know the length distribution of structural variations, so I need a database (or a reliable file)* that provides SV information from which I can calculate it. Does anyone know a database that provides this, or any other way to do this task?
Ideally I would like the estimated distribution itself, but even some summary statistics would help, for example "90% of SVs are less than 200 bases long" or similar facts.
The length distribution of SVs in the human genome would be fine, but broader information would be even better. *: Update: thanks, everyone.
Why do you need a database? A simple text file and R would be enough, wouldn't they?
You just need the regions: calculate the length of each one, then plot a histogram of the length column in R. That is it.
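As a rough illustration of the "regions to lengths" step, here is a minimal sketch in Python rather than R (R would work just as well). It assumes a BED-like, tab-separated text file with chromosome, start and end in the first three columns; that layout, the helper names `sv_lengths` and `fraction_shorter_than`, and the toy records are all made up for the example and do not come from any real SV call set.

```python
import io
import statistics


def sv_lengths(handle):
    """Yield end - start for each data line of a BED-like, tab-separated file.

    Assumed layout (hypothetical): chrom <TAB> start <TAB> end [...]
    """
    for line in handle:
        line = line.strip()
        # Skip blank lines and common header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        start, end = int(fields[1]), int(fields[2])
        yield end - start


def fraction_shorter_than(lengths, threshold):
    """Fraction of lengths strictly below threshold (a "90% are < 200" style statistic)."""
    return sum(1 for n in lengths if n < threshold) / len(lengths)


# Toy records invented for illustration only; replace with open("your_file.bed").
toy = io.StringIO(
    "chr1\t100\t150\n"
    "chr1\t1000\t1300\n"
    "chr2\t500\t5500\n"
    "chr3\t200\t320\n"
)

lengths = sorted(sv_lengths(toy))
print(lengths)                              # [50, 120, 300, 5000]
print(statistics.median(lengths))           # 210.0
print(fraction_shorter_than(lengths, 200))  # 0.5
```

SV lengths tend to span several orders of magnitude, so a histogram of log10(length) is usually easier to read than one on the raw scale.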
Yeah, I need the regions, so where do I get them?
Yeah, that could be good, but where?
"Where" what?
I need a simple text file, and then I can do this in R :) Once I have that information, I know what to do with it :) The problem is the data itself: where can I find the lengths of structural variations? Thanks.