I'm trying to recreate an analysis of ChIP-seq data from SRP051788.
I mapped the reads with bowtie2 and then used HOMER to generate peak files. I am now trying to annotate those peaks against the hg19 genome with the following command:
annotatePeaks.pl peaks.txt hg19 > annotated_peaks.txt
It runs fine, but when I look at the output file there are no annotations and no gene IDs, and I can't figure out why.
Here are the first few lines of one of the peak files I used for annotation:
# HOMER Peaks
# Peak finding parameters:
# tag directory = Bcatenin_WNT3a/
#
# total peaks = 14292
# peak size = 200
# peaks found using tags on both strands
# minimum distance between peaks = 400
# fragment length = 156
# genome size = 2000000000
# Total tags = 15622676.0
# Total tags in peaks = 1025145.0
# Approximate IP efficiency = 6.56%
# tags per bp = 0.007238
# expected tags per peak = 1.448
# maximum tags considered per bp = 1.0
# effective number of tags used for normalization = 10000000.0
# Peaks have been centered at maximum tag pile-up
# number of putative peaks = 14620
#
# input tag directory = H1_Input/
# Fold over input required = 4.00
# Poisson p-value over input required = 1.00e-04
# Putative peaks filtered by input = 214
#
# size of region used for local filtering = 10000
# Fold over local region required = 4.00
# Poisson p-value over local region required = 1.00e-04
# Putative peaks filtered by local signal = 114
#
# Maximum fold under expected unique positions for tags = 2.00
# Putative peaks filtered for being too clonal = 0
#
# cmd = findPeaks Bcatenin_WNT3a/ -style factor -size 200 -tagThreshold 30 -o auto -i H1_Input/
#
# Column Headers:
#PeakID	chr	start	end	strand	Normalized Tag Count	focus ratio	findPeaks Score	Total Tags (normalized to Control Experiment)	Control Tags	Fold Change vs Control	p-value vs Control	Fold Change vs Local	p-value vs Local	Clonal Fold Change
GL000220.1-1	GL000220.1	134369	134569	+	2451.6	0.854	374.000000	1985.5	44.0	45.12	0.00e+00	19.91	0.00e+00	0.54
GL000220.1-2	GL000220.1	124783	124983	+	905.7	0.760	370.000000	733.5	44.0	16.67	0.00e+00	9.49	0.00e+00	0.54
2-1	2	171571590	171571790	+	338.0	0.860	275.000000	273.7	0.5	547.43	0.00e+00	26.05	0.00e+00	0.67
1-2	1	33202423	33202623	+	275.2	0.806	246.000000	222.9	0.5	445.82	0.00e+00	83.94	0.00e+00	0.71
4-1	4	140545690	140545890	+	274.0	0.864	246.000000	221.9	0.5	443.75	0.00e+00	50.41	0.00e+00	0.70
I've tried adding 'chr' to the start of the chromosome names, since that looked like the way the annotation file is set up, but that didn't fix the problem. I've also tried deleting the header, since I read that it might help (in the thread "a problem with peak annotation after ChIP-seq").
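For reference, the 'chr' prefixing I describe can be sketched roughly as below. This is only an illustration, not part of HOMER: the function name `add_chr_prefix` is hypothetical, and it assumes the chromosome name is the second tab-separated column, that header lines start with '#', and that `MT` should become `chrM`. Contigs such as GL000220.1 are left untouched, because simply prepending 'chr' is unlikely to give a name the hg19 annotation recognizes.

```python
# Names treated as "bare" chromosome names (assumption: human, 1-22, X, Y).
CANONICAL = {str(n) for n in range(1, 23)} | {"X", "Y"}


def add_chr_prefix(lines, chrom_col=1):
    """Yield peak-file lines with 'chr' prepended to bare chromosome names.

    chrom_col is the 0-based index of the chromosome column
    (assumed to be the second column of a HOMER peak file).
    """
    for line in lines:
        stripped = line.rstrip("\n")
        # Pass blank lines and '#' header/comment lines through unchanged.
        if not stripped or stripped.startswith("#"):
            yield stripped
            continue
        fields = stripped.split("\t")
        # Leave malformed (too short) lines unchanged.
        if len(fields) <= chrom_col:
            yield stripped
            continue
        chrom = fields[chrom_col]
        if chrom == "MT":
            # Assumption: the mitochondrial chromosome is named chrM.
            fields[chrom_col] = "chrM"
        elif chrom in CANONICAL:
            fields[chrom_col] = "chr" + chrom
        # Anything else (e.g. GL000220.1, or names already prefixed) is kept.
        yield "\t".join(fields)


if __name__ == "__main__":
    sample = [
        "#PeakID\tchr\tstart\tend\tstrand\n",
        "GL000220.1-1\tGL000220.1\t134369\t134569\t+\n",
        "2-1\t2\t171571590\t171571790\t+\n",
    ]
    for converted in add_chr_prefix(sample):
        print(converted)
```

With a real file, the same function could be fed an open file handle for `peaks.txt` and the result written to a new file, which would then be passed to `annotatePeaks.pl` in place of the original.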
I'll try some other annotation programs and also try getting peak files from MACS2. However, since I am new to ChIP-seq analysis, I was hoping to recreate other researchers' pipelines before I get my own data back, so I'd like to get HOMER working as well.