Genomic feature overlaps taking into account underlying sequence feature distributions
9.2 years ago
Sakti ▴ 510

Dear Biostars,

Once again I am here to draw on your knowledge. I have studied binding sites for various proteins and mapped their positions along mouse chr3. After plotting the data, I noticed that several of these binding sites are clustered along stretches of the chr3 sequence.

I am trying to assess the significance of the overlaps between my protein binding sites and other proteins' binding sites, in a way that accounts for the biased (clustered) positional distribution of my data.

I used the bedtools shuffle and intersect programs like this:

bedtools shuffle -chrom -i mybindingsitespos.bed -g mm9.chr.sizes | \
    bedtools intersect -a otherproteinbindsitespos.bed -b - | wc -l

However, shuffle places each site at a random location along chr3 and does not take into account the background distribution of my data. How can I feed these programs that distribution, or how could I implement the same analysis in R?
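To make the goal concrete, here is a minimal sketch of one null model that keeps the clustering intact: a circular shift, where every site is moved by the same random offset along the chromosome so that the spacing between sites is preserved. This is plain Python rather than bedtools, and it is only an illustration under stated assumptions: the function names are hypothetical, intervals are half-open BED-style `(start, end)` pairs on a single chromosome, and the statistic is the number of my sites that overlap at least one reference site (not the number of overlap records that `bedtools intersect | wc -l` reports).

```python
import random
from bisect import bisect_left


def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def count_overlapping(sites, merged_ref):
    """Count sites that overlap at least one interval in a merged, sorted reference."""
    starts = [s for s, _ in merged_ref]
    hits = 0
    for start, end in sites:
        # Index of the first reference interval starting at or after the site end;
        # the interval just before it is the only candidate that can overlap.
        i = bisect_left(starts, end)
        if i > 0 and merged_ref[i - 1][1] > start:
            hits += 1
    return hits


def circular_shift(sites, chrom_len, offset):
    """Shift every site by the same offset, wrapping starts around the chromosome end.

    Assumption: a site that would run past the chromosome end is truncated there
    rather than split in two.
    """
    shifted = []
    for start, end in sites:
        new_start = (start + offset) % chrom_len
        new_end = min(new_start + (end - start), chrom_len)
        shifted.append((new_start, new_end))
    return shifted


def circular_shift_test(sites, reference, chrom_len, n_perm=1000, seed=0):
    """Empirical one-sided p-value for the observed overlap count under circular shifts."""
    rng = random.Random(seed)
    ref = merge_intervals(reference)
    observed = count_overlapping(sites, ref)
    null = []
    for _ in range(n_perm):
        offset = rng.randrange(1, chrom_len)
        null.append(count_overlapping(circular_shift(sites, chrom_len, offset), ref))
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (1 + sum(c >= observed for c in null)) / (n_perm + 1)
    return observed, null, p_value


# Toy data, made up purely for illustration (not real binding sites).
toy_sites = [(100, 110), (120, 130), (140, 150)]
toy_reference = [(95, 155), (5000, 5100)]
toy_observed, toy_null, toy_p = circular_shift_test(
    toy_sites, toy_reference, chrom_len=10_000, n_perm=200, seed=1
)
```

Because all sites move together, a cluster stays a cluster in every permutation, which is the property that per-site random placement loses. Whether a circular shift is an appropriate null for a given dataset is a separate judgment call.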

I've searched all over the net and haven't found much. I'd appreciate any comments!

genomics bedtools R genome-features • 2.0k views