Question: Error while annotating with Chipseeker
0
gravatar for hamaor
11 weeks ago by
hamaor0
hamaor0 wrote:

I followd the Diffbind pipeline for a ChipSeq data:

peaksets2 = dba(sampleSheet="SampleSheet.csv",config = data.frame(RunParallel=TRUE, AnalysisMethod=DBA_DESEQ2, bCorPlot=FALSE, bUsePval=TRUE, minQCth=10),minOverlap = 2)

peaksets2 <- dba.count(peaksets2)
peaksets2 <- dba.contrast(peaksets2, categories=DBA_FACTOR)
peaksets2 <- dba.analyze(peaksets2,bCorPlot=FALSE)

than I tried to annotate the peaks with Chipseeker:

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
peakAnno_inAll = annotatePeak(peaksets2.DB, TxDb=txdb, annoDb = "org.Hs.eg.db", level = "gene")

but I keep get the same error:

> peakAnno_inAll = annotatePeak(peaksets2.DB, TxDb=txdb, annoDb = "org.Hs.eg.db", level = "gene")
>> preparing features information...         2019-02-25 14:26:15 
>> identifying nearest features...       2019-02-25 14:26:15 
>> calculating distance from peak to TSS...  2019-02-25 14:26:15 
>> assigning genomic annotation...       2019-02-25 14:26:15 
Error in `[<-.data.frame`(`*tmp*`, tssIndex, "Promoter", value = TRUE) : 
  replacement has 1 row, data has 0
In addition: Warning messages:
1: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
2: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
3: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
4: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
5: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)

what do I do wrong?

ADD COMMENTlink modified 11 weeks ago • written 11 weeks ago by hamaor0

I have not tried to annotate peaks with Chipseeker; I use Homer's annotatePeaks.pl to annotate peaks.

annotatePeaks.pl peakfile hg19  > output.txt
ADD REPLYlink modified 11 weeks ago • written 11 weeks ago by patelk2620

could you send me the link for the .pl script? thanks

ADD REPLYlink written 11 weeks ago by hamaor0

http://homer.ucsd.edu/homer/download.html - You will have to download homer software which includes this annotation script http://homer.ucsd.edu/homer/ngs/annotation.html - here's a link with detailed explanation of how Homer performs annotation of peaks.

ADD REPLYlink modified 11 weeks ago • written 11 weeks ago by patelk2620

thanks. I download it and tried using it, please see my comment about it

ADD REPLYlink written 11 weeks ago by hamaor0
0
gravatar for hamaor
11 weeks ago by
hamaor0
hamaor0 wrote:

I generated my peaks using MACS2 and a peak file is looks like that:

 chr     start   end     length  abs_summit      pileup  -log10(pvalue)  fold_enrichment -log10(qvalue)  name
1       1024833 1025212 380     1024925 26.00   12.93083        5.17873 7.28933 P53_1_peak_1
1       10474802        10475143        342     10474959        21.00   7.06303 3.39586 2.57685 P53_1_peak_2

As it not correlated with the format supported by Homer, I modified it to look like that:

PeakID  chr     start   end     strand  length  abs_summit      pileup  #NAME?  fold_enrichment #NAME?  name
1       1       1179379 1179721 *       343     1179575 27      12.79556        4.97813 9.00724 PAX8_2_peak_1
2       1       1241319 1241739 *       421     1241490 36      20.38177        6.53943 16.17405        PAX8_2_peak_2
3       1       1283799 1284120 *       322     1283978 32      21.3853 7.75134 17.13212        PAX8_2_peak_3

By Adding PeakID column as first column, and "strand" column at the 5th column.

I'm running the Homer command: annotatePeaks.pl Homer_test_file.txt hg18 > outputfile.txt

and the final results ending with NA's for every peak:

PeakID (cmd=annotatePeaks.pl Homer_test_file.txt hg18)  Chr     Start   End     Strand  Peak Score      Focus Ratio/Region Size Annotation   Detailed Annotation      Distance to TSS Nearest PromoterID      Entrez ID       Nearest Unigene Nearest Refseq  Nearest Ensembl Gene Name    Gene Alias       Gene Description        Gene Type
101101915       2216    15      101101541       +       0       NA      NA      NA      NA      NA
74178602        3035    3       74178208        +       0       NA      NA      NA      NA      NA
35884393        1001    11      35884090        +       0       NA      NA      NA      NA      NA

I would love to get insight about what have I done wrong. Thanks a lot, Maor

ADD COMMENTlink written 11 weeks ago by hamaor0

Your peakfile should have these five tab-delimited columns -

  • Column1: Unique Peak ID
  • Column2: chromosome
  • Column3: starting position
  • Column4: ending position
  • Column5: Strand (+/- or 0/1, where 0="+", 1="-")

Please ensure your chromosomes have "chr" before the number.

ADD REPLYlink modified 11 weeks ago • written 11 weeks ago by patelk2620
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 646 users visited in the last hour