How to show statistical significance of overlap of CTCF binding sites
3.1 years ago
samuel ▴ 240

I have carried out RNA-Seq for lncRNAs. Eight lncRNAs are differentially expressed, and all 8 overlap CTCF binding sites when viewed in the Ensembl genome browser. How do I show that this overlap is statistically significant rather than happening by chance? I have read papers that did something similar, but they never state how they did it. Also, could someone tell me where I can find out exactly how many CTCF binding sites there are in human? I cannot find this information in Ensembl. Many thanks.

RNA-Seq genome • 670 views
3.1 years ago

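You can calculate this using a hypergeometric test, which is also commonly used for GO enrichment. In short, it gives the probability that, if you draw 8 lncRNAs at random without replacement from all the lncRNAs you tested, all 8 overlap a CTCF binding site. To run it you need two background numbers: the total number of lncRNAs in your tested set, and how many of those overlap a CTCF site.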
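As a rough sketch of that test, using only the Python standard library: the function name `hypergeom_upper_tail` is made up for illustration, and the background numbers (1000 lncRNAs tested, 300 of them overlapping a CTCF site) are placeholders, not real counts. You would substitute the values from your own data.

```python
from math import comb


def hypergeom_upper_tail(N, K, n, k):
    """P(X >= k) for a hypergeometric draw.

    N: size of the background (e.g. all lncRNAs tested)
    K: background items with the property (e.g. overlapping a CTCF site)
    n: number of items drawn (e.g. differentially expressed lncRNAs)
    k: number of drawn items observed with the property
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("need 0 <= K <= N and 0 <= n <= N")
    upper = min(n, K)
    # Count the ways to pick i items with the property and n - i without it,
    # for every i from the observed value up to the maximum possible.
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1)
    )
    return favourable / comb(N, n)


# Placeholder background numbers (assumptions, NOT real data):
total_lncrnas = 1000      # lncRNAs tested in the experiment
lncrnas_with_ctcf = 300   # of those, how many overlap a CTCF site
de_lncrnas = 8            # differentially expressed lncRNAs
de_with_ctcf = 8          # of those, how many overlap a CTCF site

p_value = hypergeom_upper_tail(
    total_lncrnas, lncrnas_with_ctcf, de_lncrnas, de_with_ctcf
)
print(f"p = {p_value:.3g}")
```

The result depends heavily on the background: if most lncRNAs overlap a CTCF site anyway, 8 out of 8 is not surprising, so the choice of background set matters more than the test itself.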

All of the CTCF binding sites can be accessed from the Ensembl Regulation dataset in BioMart.
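Once the sites are exported, counting how many lncRNAs overlap at least one of them is a simple interval comparison. A minimal sketch, assuming both sets are lists of `(chrom, start, end)` tuples with half-open coordinates; the function names and the coordinates below are invented for illustration and are not real annotations.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share any base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def count_overlapping(features, sites):
    """Number of features that overlap at least one site."""
    return sum(any(overlaps(f, s) for s in sites) for f in features)


# Made-up coordinates, for illustration only
lncrnas = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
ctcf_sites = [("chr1", 150, 160), ("chr2", 300, 400)]

n_sites = len(ctcf_sites)                       # total number of sites
n_hit = count_overlapping(lncrnas, ctcf_sites)  # lncRNAs with an overlap
print(n_sites, n_hit)
```

This all-against-all comparison is fine for a few thousand intervals; for genome-wide sets a dedicated tool such as bedtools would be the more usual choice.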
