Hello all, where can I download the chromosomal coordinates for the entire hg19 reference genome? Additionally, can anyone help me write a Python script that will split the whole genome into 50 kbp windows? My draft Python code is given below:
step = 50000
with open('Chr.bed', 'w') as outfp:
    for val in range(0, 300000000, step):
        outfp.write('{0}\t{1}\t{2}\n'.format('chr', val, val+step))
I would appreciate any help and suggestions. Thank you
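One note on the draft above: it writes the literal name 'chr' on every line and runs every chromosome out to 300 Mb, so the windows overshoot the real chromosome ends. A minimal sketch of a fix, assuming you have a two-column, tab-separated file of chromosome names and lengths (UCSC distributes one for hg19 as hg19.chrom.sizes on its downloads server). The function names here are made up for illustration, and `bedtools makewindows -g hg19.chrom.sizes -w 50000` should give the same kind of output without any scripting.

```python
def read_chrom_sizes(path):
    """Read a two-column file: chromosome name, chromosome length."""
    sizes = {}
    with open(path) as fh:
        for line in fh:
            if line.strip():
                chrom, size = line.split()[:2]
                sizes[chrom] = int(size)
    return sizes


def make_windows(chrom_sizes, step=50000):
    """Yield (chrom, start, end) BED-style windows (0-based, half-open).

    The last window of each chromosome is clipped to the chromosome length.
    """
    for chrom, size in chrom_sizes.items():
        for start in range(0, size, step):
            yield chrom, start, min(start + step, size)


def write_windows(chrom_sizes, out_path, step=50000):
    """Write the windows to a BED file, one window per line."""
    with open(out_path, 'w') as outfp:
        for chrom, start, end in make_windows(chrom_sizes, step):
            outfp.write('{0}\t{1}\t{2}\n'.format(chrom, start, end))
```

Usage would be something like `write_windows(read_chrom_sizes('hg19.chrom.sizes'), 'windows.bed')`, where both file names are placeholders for whatever you have locally.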
Thank you for your suggestions! I will check out bedtools for making the genomic windows. Thank you once again
Thank you all for your expert insights! Bedtools and bedops are better options than writing my own Python script. My purpose in generating these windows is to find the CNV frequency in each genomic interval for the 200,000 CNV coordinates in my dataset. The bedtools intersect function with the -c flag seems a good option for counting the CNV coordinates that overlap each window. The end goal is to see which intervals of the genome do not have any CNVs at all. Thank you once again, guys. Really appreciate it.
If you want bins without CNVs:
That awk bit can be adjusted to get bins with varied numbers of CNVs.
Thank you so much for your help. Really appreciate it. I will give it a try. Thanks once again.
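For anyone who would rather stay in Python, the count-and-filter idea discussed in this thread can be sketched as below. This is not the bedtools/awk command from the answer, just an illustration under some assumptions: windows are fixed-size and start at position 0 on each chromosome, CNVs are (chrom, start, end) tuples in 0-based half-open BED coordinates, and the function names are hypothetical.

```python
from collections import Counter


def count_cnvs_per_window(cnvs, step=50000):
    """Count how many CNVs overlap each fixed-size window.

    Returns a Counter keyed by (chrom, window_start).
    """
    counts = Counter()
    for chrom, start, end in cnvs:
        if end <= start:
            continue  # skip empty or malformed intervals
        first = start // step
        last = (end - 1) // step  # end is exclusive in BED
        for idx in range(first, last + 1):
            counts[(chrom, idx * step)] += 1
    return counts


def empty_windows(chrom_sizes, cnvs, step=50000):
    """Return the windows that no CNV overlaps, as (chrom, start, end)."""
    counts = count_cnvs_per_window(cnvs, step)
    return [
        (chrom, start, min(start + step, size))
        for chrom, size in chrom_sizes.items()
        for start in range(0, size, step)
        if counts[(chrom, start)] == 0
    ]
```

Changing the `== 0` test to another comparison (for example `>= 5`) would select bins with other CNV counts, in the same spirit as adjusting the awk filter.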