Question: How to generate random intervals on the human genome that do not overlap the intervals in another BED file
npy179 (United States) wrote 3.9 years ago:

I want to generate 1,000,000 random intervals on the human genome to use as background sequences, and these intervals must not overlap my foreground intervals. My current solution is to generate the random intervals with bedtools random and then remove whatever overlaps the foreground. Is there a better solution? Any comments or suggestions would be appreciated.

James Ashmore (UK/Edinburgh/MRC Centre for Regenerative Medicine) wrote 3.9 years ago:
# Create 1 million random intervals (hg38.tsv is a bedtools genome file: chromosome name and size, tab-separated)
bedtools random -g hg38.tsv -n 1000000 > random.bed
# Relocate the intervals at random, excluding the foreground regions listed in foreground.bed
bedtools shuffle -i random.bed -g hg38.tsv -excl foreground.bed > background.bed
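
The commands above need a genome file (hg38.tsv) that lists each chromosome name and its size. If you do not have one, a minimal sketch for deriving it from a FASTA index follows. It assumes an index in the layout written by samtools faidx (sequence name in column 1, length in column 2); the function name make_genome_file and the file names in the usage comment are placeholders, not part of the original answer.

```shell
# Sketch: build a bedtools genome file (name<TAB>size) from a FASTA index.
# Assumption: stdin is a .fai file as written by `samtools faidx`, where
# column 1 is the sequence name and column 2 is the sequence length.
make_genome_file() {
    # Keep only the first two tab-separated columns: name and length.
    cut -f1,2
}

# Hypothetical usage (file names are placeholders):
#   make_genome_file < hg38.fa.fai > hg38.tsv
```

Restricting the genome file to the chromosomes you care about (for example, dropping unplaced scaffolds) also restricts where the random intervals can land.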
Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote 3.9 years ago:

Use your BED file as a genome file?

awk '{printf("%s_%s_%s\t%d\n",$1,$2,$3,int($3)-int($2));}' your.bed > my.genome

Then use 'bedtools random' with my.genome as the genome file to generate the new BED file.

Finally, use awk to decode the names back into chromosomes and genomic coordinates.
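
The decoding step is not spelled out above, so here is one possible sketch. It assumes the random intervals carry names of the form chrom_start_end (as produced by the printf above) and that their coordinates are offsets from the start of that region. The function name decode_intervals is hypothetical. Because some hg38 sequence names contain underscores themselves, the sketch treats only the last two underscore-separated pieces as start and end.

```shell
# Sketch: turn "chrom_start_end<TAB>offsetStart<TAB>offsetEnd..." lines back
# into genomic BED coordinates. Assumption: columns 2 and 3 are offsets
# relative to the encoded region start; extra columns are passed through.
decode_intervals() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        n = split($1, part, "_")
        offset = part[n - 1]              # start of the encoded region
        chrom = part[1]                   # rebuild names that contain "_"
        for (i = 2; i <= n - 2; i++) chrom = chrom "_" part[i]
        $1 = chrom
        $2 = offset + $2                  # shift back to genomic coordinates
        $3 = offset + $3
        print
    }'
}

# Hypothetical usage (file names are placeholders):
#   bedtools random -g my.genome -n 1000000 | decode_intervals > background.bed
```

As I read this approach, the intervals are drawn inside the regions of your.bed, so for the original question your.bed would have to hold the allowed regions (for example the complement of the foreground, which bedtools complement can produce) rather than the foreground itself. Regions shorter than the requested interval length cannot hold an interval, so they may need to be filtered out first.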

 

 
