Question

Collapse overlapping annotations in R

2.6 years ago

James ▴ 20

I am trying to analyse some data with the R package groHMM, but I am working with Arabidopsis rather than human data. groHMM uses the R package GenomicFeatures for annotations, and I am using TxDb.Athaliana.BioMart.plantsmart28. The tool wants me to collapse overlapping annotations so that overlapping transcripts are merged into "a single set, in which each annotation represents the 5' and 3' most boundaries of genes". I do not know how to do this in R, and the example code only works for human data (hg19). If anyone knows how to get it to work for the Arabidopsis data, that would be really helpful.
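
To make the quoted goal concrete, here is a toy illustration in base R of what "5' and 3' most boundaries" means: for each gene, take the smallest start and the largest end across its transcripts. The gene names, transcript names and coordinates are made up for illustration. This only shows the basic idea and is not meant to reproduce what groHMM does internally.

```r
# Toy transcript table; all names and coordinates are hypothetical.
tx <- data.frame(
  gene_id = c("geneA", "geneA", "geneA", "geneB", "geneB"),
  tx_name = c("geneA.1", "geneA.2", "geneA.3", "geneB.1", "geneB.2"),
  start   = c(100, 150, 120, 900, 950),
  end     = c(500, 650, 480, 1400, 1300),
  stringsAsFactors = FALSE
)

# For each gene, keep the outermost boundaries across its transcripts.
starts <- tapply(tx$start, tx$gene_id, min)
ends   <- tapply(tx$end, tx$gene_id, max)

# One row per gene, spanning all of its transcripts.
collapsed <- data.frame(
  gene_id = names(starts),
  start   = as.vector(starts),
  end     = as.vector(ends[names(starts)]),
  stringsAsFactors = FALSE
)
print(collapsed)
```

With real data the same per-gene grouping would be done on a GRanges object rather than a data frame, and strand and chromosome would need to be carried along as well.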

The example code in the package works like this:

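library(groHMM)  # provides makeConsensusAnnotations()
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
kgdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
# Transcripts on chr7, keeping gene and transcript identifiers
kgChr7 <- transcripts(kgdb, filter=list(tx_chrom = "chr7"), columns=c("gene_id", "tx_id", "tx_name"))
library(org.Hs.eg.db)
# Collapse overlapping annotations into one range per gene
kgConsensus <- makeConsensusAnnotations(kgChr7, keytype="gene_id", mc.cores=getOption("mc.cores"))
# Map the Entrez gene IDs to gene symbols
map <- select(org.Hs.eg.db, keys=unlist(mcols(kgConsensus)$gene_id), columns=c("SYMBOL"), keytype=c("ENTREZID"))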

My attempt at the same lookup with the corresponding Arabidopsis database fails:

map <- select(org.At.tair.db, keys=unlist(mcols(atConsensus)$gene_id), columns=c("SYMBOL"), keytype=c("ENTREZID"))
Error in .testForValidKeys(x, keys, keytype, fks) :
  None of the keys entered are valid keys for 'ENTREZID'. Please use the keys method to see a listing of valid arguments.

Even if I cannot get this exact code to work, anything that gives me a correct set of collapsed transcripts that groHMM can accept would be great.
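
One possible direction, offered as an untested sketch: the error says none of the keys are valid for 'ENTREZID', which would be expected if the gene_id values in the Arabidopsis TxDb are TAIR locus identifiers (for example AT1G01010) rather than Entrez IDs. That is an assumption, and the head() call below is there to check it. org.At.tair.db lists "TAIR" among its keytypes, so switching the keytype may be enough. The object names atdb and atTx are placeholders, the fallback of 1 for mc.cores is added so the call does not receive NULL when that option is unset, and select() is called with its namespace because MASS, which groHMM attaches, also exports a function called select.

```r
library(groHMM)
library(GenomicFeatures)
library(TxDb.Athaliana.BioMart.plantsmart28)
library(org.At.tair.db)

# Placeholder names; the TxDb object is the one exported by the package.
atdb <- TxDb.Athaliana.BioMart.plantsmart28
atTx <- transcripts(atdb, columns=c("gene_id", "tx_id", "tx_name"))

# Collapse overlapping annotations, mirroring the hg19 example call.
atConsensus <- makeConsensusAnnotations(atTx, keytype="gene_id",
                                        mc.cores=getOption("mc.cores", 1L))

# Inspect the identifiers and the keytypes the annotation package accepts.
head(unlist(mcols(atConsensus)$gene_id))
keytypes(org.At.tair.db)

# Assumption: gene_id holds TAIR locus IDs, so use keytype "TAIR".
map <- AnnotationDbi::select(org.At.tair.db,
                             keys=unlist(mcols(atConsensus)$gene_id),
                             columns=c("SYMBOL"), keytype="TAIR")
```

If head() shows identifiers of a different kind, the keytype would need to match whatever keytypes() reports for that identifier instead.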

transcript GenomicFeatures Genomics annotations groHMM R
