Testing for over-representation of chip-peaks
8.4 years ago
Bioradical ▴ 60

I am trying to test for over-representation of a set of overlapped chip-seq peaks between constitutive exons and alternatively spliced exons.

I have a bed file that contains my overlapped chip peaks, a bed file that contains constitutive exons and a bed file that contains alternatively spliced exons (generated using a custom script provided by some authors of a paper). I am interested in running a statistical test that tells me whether my overlapped peaks are over-represented / enriched in my constitutive exon file, or alternatively spliced exon file individually.

I thought about running a hypergeometric test using the phyper function in R. But I'm not quite sure what numbers I would use specifically.

I also tried bedtools fisher, testing my overlapped chip-seq peak file against my constitutive exon file and then against my alternatively spliced exon file separately. This returned a p-value of 0 for both, which I guess doesn't make much sense (though I am not very math-oriented). I mostly work on wet-lab stuff as an assistant.

Any help is appreciated.

Overrepresentation ChIP-Seq R Stats • 2.7k views
7.2 years ago
bernatgel ★ 3.4k

You can use the R/Bioconductor package regioneR for this. It implements a statistical test for the association of genomic regions (such as chip peaks and exons) based on random permutations.

In this case I think the best approach would be to "flip" the question: ask whether alternatively spliced (or constitutive) exons tend to be associated with the chip peaks, and use the "resampling" randomization strategy.

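For example (untested code!):

library(regioneR)

# Load the three BED files as GRanges objects
chip.peaks <- toGRanges("chip.peaks.bed")
alt.exons <- toGRanges("alt.exons.bed")
const.exons <- toGRanges("const.exons.bed")

# The universe to resample from: every exon, alternative or constitutive
all.exons <- c(alt.exons, const.exons)

# Compare the overlaps between alt.exons and the peaks against
# 1000 random exon sets drawn from the universe
pt <- permTest(A=alt.exons, B=chip.peaks, universe=all.exons, 
           randomize.function = resampleRegions, evaluate.function = numOverlaps, 
           ntimes = 1000)

# Print the test summary and plot the permuted distribution
pt
plot(pt)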

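This will create 1000 random sets of exons, each sampled from all.exons and the same size as alt.exons, and test whether the alt.exons overlap the peaks more often than one would expect by chance. To test the constitutive exons, run it again with A=const.exons.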
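To make the resampling idea concrete, here is a minimal base-R sketch of the same logic without regioneR. It is only an illustration: the counts are made up, the variable names (has.peak, n.alt, observed) are hypothetical, and it simplifies things by counting exons that carry at least one peak rather than the total number of overlaps. The "+1" in the empirical p-value is a common convention for permutation tests, not something taken from the original answer.

```r
set.seed(1)  # make the random draws reproducible

# Hypothetical universe of 10,000 exons; TRUE means the exon overlaps a peak
has.peak <- c(rep(TRUE, 1000), rep(FALSE, 9000))

n.alt    <- 2000  # hypothetical number of alternatively spliced exons
observed <- 230   # hypothetical number of alt exons that overlap a peak
ntimes   <- 1000  # number of random exon sets to draw

# Draw n.alt exons at random (without replacement) from the universe,
# count how many overlap a peak, and repeat ntimes
permuted <- replicate(ntimes, sum(sample(has.peak, n.alt)))

# Empirical one-sided p-value: how often a random set does at least as well
p.emp <- (sum(permuted >= observed) + 1) / (ntimes + 1)

# z-score: distance of the observed count from the random expectation
z <- (observed - mean(permuted)) / sd(permuted)

p.emp
z
```

With ntimes = 1000 the smallest p-value this can report is 1/1001, so a very strong association needs more permutations to be resolved further.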

You can find more information about how to use regioneR and about permutation tests in the package vignette.
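On the phyper question: if you also treat exons (not peaks) as the things being counted, one way to lay out the numbers is sketched below. The counts are invented for illustration and the variable names are hypothetical; with real data they could be obtained with something like sum(overlapsAny(alt.exons, chip.peaks)) from GenomicRanges. Note that this assumes exons are independent and equally likely to be hit, which ignores exon length and genomic clustering, so treat it as a rough check rather than a replacement for the permutation test.

```r
# Hypothetical counts, for illustration only
n.alt     <- 2000  # alternatively spliced exons
n.const   <- 8000  # constitutive exons
alt.hit   <- 230   # alt exons overlapping at least one peak
const.hit <- 770   # constitutive exons overlapping at least one peak

N <- n.alt + n.const      # universe: all exons
K <- alt.hit + const.hit  # exons in the universe that overlap a peak

# Probability of seeing alt.hit or more peak-overlapping exons when
# n.alt exons are drawn at random from the universe.
# phyper(q, m, n, k): q = successes - 1, m = "white balls" (K),
# n = "black balls" (N - K), k = number drawn (n.alt)
p.hyper <- phyper(alt.hit - 1, K, N - K, n.alt, lower.tail = FALSE)

# The same question as a one-sided Fisher exact test on a 2x2 table
tab <- matrix(c(alt.hit,   n.alt - alt.hit,
                const.hit, n.const - const.hit),
              nrow = 2,
              dimnames = list(c("peak", "no.peak"), c("alt", "const")))
p.fisher <- fisher.test(tab, alternative = "greater")$p.value

p.hyper
p.fisher
```

The two p-values should agree, since the one-sided Fisher exact test is the hypergeometric test written as a contingency table. To test the constitutive exons instead, swap the roles of the alt and const counts.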
