Question: Genomic feature overlaps taking into account underlying sequence feature distributions
Written 6.1 years ago by Sakti (United States):

Dear Biostars,

Once again I am here to draw on your expertise. I have studied binding sites for various proteins and mapped their positions along chr3 in mouse. After plotting the data, I noticed that several of these binding sites are clustered along stretches of the chr3 sequence.

I am trying to assess the significance of the overlap between my protein binding sites and those of other proteins, in a way that accounts for the biased (clustered) positional distribution of my data.

I used the bedtools shuffle and intersect programs like this:

bedtools shuffle -chrom -i mybindingsitespos.bed -g mm9.chr.sizes | \
    bedtools intersect -a otherproteinbindsitespos.bed -b - | wc -l

However, shuffle places features at random locations along the chr3 sequence and does not take into account the background distribution of my data. How can I feed that distribution to these programs, or how could I implement the same analysis in R?

I've searched all over the net and wasn't able to find much. I'd appreciate any comments!
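
One possible direction, offered only as a sketch: if the background distribution can be written down as a set of allowed regions (for example, the stretches of chr3 where the sites cluster), the shuffled positions can be restricted to those regions. bedtools shuffle has an -incl option that takes a file of coordinates in which the shuffled features should be placed, which may already be enough. The Python below illustrates the same shuffle-and-count logic as an empirical permutation test. The function names, the toy coordinates, and the assumption that the background is a list of regions on a single chromosome are all hypothetical and not taken from the original analysis.

```python
import random


def count_overlaps(a, b):
    """Count (a, b) pairs of half-open intervals that share at least one base.

    Naive O(len(a) * len(b)) scan; fine for a sketch, slow for large files.
    """
    return sum(1 for s1, e1 in a for s2, e2 in b if s1 < e2 and s2 < e1)


def shuffle_within(sites, background, rng):
    """Move each site to a random position inside the background regions.

    Site lengths are preserved. A region is chosen with probability
    proportional to the number of valid start positions it offers.
    """
    shuffled = []
    for start, end in sites:
        length = end - start
        fits = [(s, e) for s, e in background if e - s >= length]
        if not fits:
            raise ValueError("no background region can hold a site of length %d" % length)
        weights = [e - s - length + 1 for s, e in fits]
        s, e = rng.choices(fits, weights=weights, k=1)[0]
        new_start = rng.randint(s, e - length)  # inclusive on both ends
        shuffled.append((new_start, new_start + length))
    return shuffled


def permutation_p_value(sites, other, background, n_iter=1000, seed=0):
    """Return (observed overlap count, empirical one-sided p-value)."""
    rng = random.Random(seed)
    observed = count_overlaps(sites, other)
    null_counts = [
        count_overlaps(shuffle_within(sites, background, rng), other)
        for _ in range(n_iter)
    ]
    as_extreme = sum(1 for c in null_counts if c >= observed)
    # Add-one correction so the estimate is never exactly zero.
    return observed, (as_extreme + 1) / (n_iter + 1)


# Toy, made-up coordinates on one chromosome (assumption: half-open, BED-style).
my_sites = [(100, 120), (150, 170), (900, 920)]
other_sites = [(110, 130), (160, 180), (905, 915)]
background_regions = [(0, 200), (850, 1000)]

observed, p_value = permutation_p_value(my_sites, other_sites, background_regions)
print("observed overlaps:", observed, "empirical p-value:", p_value)
```

With real data, the interval lists would be read from the BED files, and how the background regions are defined (the clusters themselves, or something broader) largely determines what the resulting p-value means.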
