Help generating multiple annotations under one H3K27me3 peak
0
1
Entering edit mode
3 months ago
Rory Osborne ▴ 10

Hi everyone, first post here!

I'm currently working on a ChIPseq data set enriching for H3K27me3 between four Arabidopsis genotypes.

My current problem is at the annotation step using annotatepeak from ChIPseeker. Because the peaks I have called (using MACS2) are broad, they cover multiple loci in the genome, but only become annotated with the closest TSS instead of say, three different loci. As a result, identifying differentially methylated regions between my WT and mutants is quite error prone when I look at the bed files in IGV. I've tried to show an example in the image attached.

In this example, the peak covering AT2G19000 is showing as not methylated in my global annotation for Mutant 1 because it has mapped only to the next gene across, AT2G19010

This is the line of code I use in R:

WT_consensus_anno <- annotatePeak(WT_covered_granges, tssRegion=c(-3000, 3000),
TxDb=txdb, annoDb="org.At.tair.db")


Where txdb is extracted from the TxDb.Athaliana.BioMart.plantsmart28 package, and WT_covered_granges is a GRanges object containing the called peaks from MACS2.

Any help would be much appreciated, I've been puzzling over this for a few days and searched everywhere for answers.

If I've left out any information please let me know

Thanks a lot!

ChIP-Seq R chipseeker • 271 views
0
Entering edit mode

One thing that ChIPseeker has is the flank option. See here. It defines a flanking region around the peaks, and reports the genes that overlap the flanking region.

I am not 100% sure how is the output defined. What I did was bypassing the ChIPseeker. I needed to find the distances from the TSS only and not annotate other elements. So I extracted the TSS from the GTF file. Then with bedtools window, I have reported the TSS which overlapped my peaks file and calculated the distance from it using awk.