Question: Peak annotation using ChIPseeker (bioconductor package)
0
gravatar for swadha
2.6 years ago by
swadha20
University of California, United States
swadha20 wrote:

I want to do peak annotation of C.elegans ChIP data using ChIPseeker.

I have used this command peakAnno <- annotatePeak(files, tssRegion=c(-3000, 3000), TxDb=txdb, annoDb="org.Ce.eg.db")

I'm using 'org.Ce.eg.db' from BioConductor for this.

I get the following error:

Error in .testForValidKeys(x, keys, keytype, fks) : None of the keys entered are valid keys for 'ENSEMBL'. Please use the keys method to see a listing of valid arguments.

I would really appreciate if someone could help me resolve this error.

chip-seq • 2.2k views
ADD COMMENTlink modified 2.6 years ago • written 2.6 years ago by swadha20

What are you TxDb are you using?

ADD REPLYlink written 2.6 years ago by Sinji2.8k

I'm using TxDb.Celegans.UCSC.ce6.ensGene

ADD REPLYlink written 2.6 years ago by swadha20

Hmm, sounds like you're using the right datasets. Mind copy and pasting your code so I can take a look at it? Also put 'chipseeker' as a tag so that the author can get a notification about this question. He's pretty active around here.

ADD REPLYlink written 2.6 years ago by Sinji2.8k

require(ChIPseeker)

require(TxDb.Celegans.UCSC.ce6.ensGene)

txdb <-TxDb.Celegans.UCSC.ce6.ensGene

require(clusterProfiler)

files = "/home/swadha/Desktop/SFSU/c.elegans_peaks.bed"

peak <- readPeakFile(files)

covplot(peak, weightCol="V5")

promoter <- getPromoters(TxDb=txdb, upstream=3000, downstream=3000)

tagMatrix <- getTagMatrix(peak, windows=promoter)

tagHeatmap(tagMatrix, xlim=c(-3000, 3000), color="red")

peakHeatmap(files, TxDb=txdb, upstream=3000, downstream=3000, color="red")

plotAvgProf(tagMatrix, xlim=c(-3000, 3000), xlab="Genomic Region (5'->3')", ylab = "Read Count Frequency")

peakAnno <- annotatePeak(files, tssRegion=c(-3000, 3000), TxDb=txdb, annoDb="org.Ce.eg.db")

loading peak file... 2016-07-28 07:51:40 PM preparing features information... 2016-07-28 07:51:41 PM identifying nearest features... 2016-07-28 07:51:41 PM calculating distance from peak to TSS... 2016-07-28 07:51:42 PM assigning genomic annotation... 2016-07-28 07:51:42 PM adding gene annotation... 2016-07-28 07:51:43 PM

Error in .testForValidKeys(x, keys, keytype, fks) : None of the keys entered are valid keys for 'ENSEMBL'. Please use the keys method to see a listing of valid arguments.

How to tag someone on biostars??

ADD REPLYlink modified 2.6 years ago • written 2.6 years ago by swadha20
1

Try:

peaks <- readPeakFile("/home/swadha/Desktop/SFSU/c.elegans_peaks.bed", as = "GRanges")
peaksAnno <- annotatePeak(peaks, TxDb=txdb, annoDb="org.Ce.eg.db")

Might be useful to include a few lines of your bed file as well so we can take a look at that.

ADD REPLYlink modified 2.6 years ago • written 2.6 years ago by Sinji2.8k

not to tag someone, but the package name.

If you tag your post with chipseeker, I can receive email notification.

Please post a few lines of your bed file as suggested by @Sinji.

ADD REPLYlink modified 2.6 years ago • written 2.6 years ago by Guangchuang Yu2.1k
2
gravatar for Guangchuang Yu
2.6 years ago by
Guangchuang Yu2.1k
China/Hong Kong/The University of Hong Kong
Guangchuang Yu2.1k wrote:

I download a bed file from GEO and test it. Finally I figure out what's going on here.

This is due to the TxDb.Celegans.UCSC.ce6.ensGene object.

> txdb = TxDb.Celegans.UCSC.ce6.ensGene
> txdb
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: UCSC
# Genome: ce6
# Organism: Caenorhabditis elegans
# Taxonomy ID: 6239
# UCSC Table: ensGene
# Resource URL: http://genome.ucsc.edu/
# Type of Gene ID: Ensembl gene ID
# Full dataset: yes
# miRBase build ID: NA
# transcript_nrow: 35019
# exon_nrow: 146833
# cds_nrow: 127881
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2015-10-07 18:15:25 +0000 (Wed, 07 Oct 2015)
# GenomicFeatures version at creation time: 1.21.30
# RSQLite version at creation time: 1.0.0
# DBSCHEMAVERSION: 1.1
> txidinfo <- transcripts(txdb, columns = c("tx_id", "tx_name", "gene_id"))
> head(txidinfo)
GRanges object with 6 ranges and 3 metadata columns:
      seqnames         ranges strand |     tx_id     tx_name         gene_id
         <Rle>      <IRanges>  <Rle> | <integer> <character> <CharacterList>
  [1]     chrI [11495, 16793]      + |         1  Y74C9A.2.2        Y74C9A.2
  [2]     chrI [11499, 16790]      + |         2  Y74C9A.2.3        Y74C9A.2
  [3]     chrI [11499, 16831]      + |         3  Y74C9A.2.1        Y74C9A.2
  [4]     chrI [11505, 16790]      + |         4  Y74C9A.2.4        Y74C9A.2
  [5]     chrI [11618, 16804]      + |         5  Y74C9A.2.5        Y74C9A.2
  [6]     chrI [43733, 44677]      + |         6    Y74C9A.1        Y74C9A.1
  -------
  seqinfo: 7 sequences (1 circular) from ce6 genome
>

Although the ID type recorded in the txdb metadata is ensemble, Type of Gene ID: Ensembl gene ID (also indicated by the ensGene part of the object name), it actually doesn't have gene ID available in the object.

As indicated in the above code, gene_id is actually transcript name but not ensembl gene ID.

The last step which throw error:

>> adding gene annotation... 2016-07-28 07:51:43 PM

Error in .testForValidKeys(x, keys, keytype, fks) : None of the keys entered are valid keys for 'ENSEMBL'. Please use the keys method to see a listing of valid arguments.

is trying to convert ENSEMBL ID toEntrezgene ID and gene Symbol using annoDb="org.Ce.eg.db" which of course will be failed since the ID is not ensemble but tx_name.

This issue actually had been fixed and you are using out-dated version of ChIPseeker.

Please follow my post when you got an issue.

In the latest release version of ChIPseeker, when annotatePeak fail to convert gene ID, it will throw warning and skip this step.

ADD COMMENTlink modified 2.6 years ago • written 2.6 years ago by Guangchuang Yu2.1k
0
gravatar for swadha
2.6 years ago by
swadha20
University of California, United States
swadha20 wrote:

Hi,

The bed files were generated from the tool macs2 using following command: macs2 callpeak -t HTZ.S36A.Ex39.Br.map.ce10.gz -n HTZ.S36A. -g ce --bdg --keep-dup=auto --broad --broad-cutoff=0.01 --nomodel)

Here is the link of the screenshot of my bed file.

Thanks for replying to my query so actively. I upgraded ChIPseeker (version:1.8.9), but the same error is occurring. How can I overcome this error??

ADD COMMENTlink modified 2.6 years ago • written 2.6 years ago by swadha20

just use:

peaksAnno <- annotatePeak(peaks, TxDb=txdb)
ADD REPLYlink written 2.6 years ago by Guangchuang Yu2.1k
0
gravatar for swadha
2.6 years ago by
swadha20
University of California, United States
swadha20 wrote:

That worked perfectly fine!! Thanks a ton Sinji and Guangchuang :)

ADD COMMENTlink written 2.6 years ago by swadha20
2
> peakAnno <- annotatePeak(peak, tssRegion=c(-3000, 3000), TxDb=txdb, annoDb="org.Ce.eg.db")
>> preparing features information...         2016-08-02 15:20:33
>> identifying nearest features...       2016-08-02 15:20:34
>> calculating distance from peak to TSS...  2016-08-02 15:20:38
>> assigning genomic annotation...       2016-08-02 15:20:38
>> adding gene annotation...             2016-08-02 15:20:45
Loading required package: org.Ce.eg.db

>> assigning chromosome lengths          2016-08-02 15:20:45
>> done...                   2016-08-02 15:20:45
Warning message:
In getGeneAnno(annoDb, peak.gr$geneId, type) :
  ID type not matched, gene annotation will not be added...

In ChIPseeker (version >=1.9.5), it works fine with this case.

ADD REPLYlink written 2.6 years ago by Guangchuang Yu2.1k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1033 users visited in the last hour