I am trying to use the R package groHMM to analyse some data, but I am working with Arabidopsis rather than human data. groHMM uses the R package GenomicFeatures for annotations, and I am using TxDb.Athaliana.BioMart.plantsmart28. The tool wants me to collapse overlapping annotations so that overlapping transcripts are merged into "a single set, in which each annotation represents the 5' and 3' most boundaries of genes". I do not know how to do this in R, and the example code only works for human data (hg19). If anyone knows how to get it to work for the Arabidopsis data, that would be really helpful.
The example code in the package works like this:
    library(TxDb.Hsapiens.UCSC.hg19.knownGene)
    kgdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
    kgChr7 <- transcripts(kgdb, filter=list(tx_chrom = "chr7"),
                          columns=c("gene_id", "tx_id", "tx_name"))
    library(org.Hs.eg.db)
    kgConsensus <- makeConsensusAnnotations(kgChr7, keytype="gene_id",
                                            mc.cores=getOption("mc.cores"))
    map <- select(org.Hs.eg.db,
                  keys=unlist(mcols(kgConsensus)$gene_id),
                  columns=c("SYMBOL"), keytype=c("ENTREZID"))
My attempt at using a similar Arabidopsis database fails:
    map <- select(org.At.tair.db,
                  keys=unlist(mcols(atConsensus)$gene_id),
                  columns=c("SYMBOL"), keytype=c("ENTREZID"))
    Error in .testForValidKeys(x, keys, keytype, fks) :
      None of the keys entered are valid keys for 'ENTREZID'. Please use the keys method to see a listing of valid arguments.
Even if I cannot get this exact code to work, anything that gives me the correct set of collapsed transcripts in a form that groHMM can accept would be great.
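To make clear what I mean by "collapsed", here is a minimal base R sketch of the idea as I understand it: one row per gene, running from the smallest transcript start to the largest transcript end. This is a simplification and not groHMM's actual algorithm. The table is a toy example (the gene IDs are only shaped like TAIR locus IDs and the coordinates are invented), and `collapse_by_gene` is a hypothetical helper name, not a function from any package.

```r
# Toy transcript table: coordinates are invented for illustration only
tx <- data.frame(
  gene_id = c("AT1G01010", "AT1G01010", "AT1G01020", "AT1G01020", "AT1G01020"),
  chrom   = c("1", "1", "1", "1", "1"),
  strand  = c("+", "+", "-", "-", "-"),
  start   = c(3631, 3760, 5928, 6790, 6790),
  end     = c(5899, 5630, 8737, 8737, 8400),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collapse all transcripts of a gene into one interval
# spanning the outermost start and end seen for that gene
collapse_by_gene <- function(tx) {
  starts <- aggregate(start ~ gene_id + chrom + strand, data = tx, FUN = min)
  ends   <- aggregate(end ~ gene_id + chrom + strand, data = tx, FUN = max)
  out <- merge(starts, ends, by = c("gene_id", "chrom", "strand"))
  out[order(out$chrom, out$start), ]  # sort by position for readability
}

collapsed <- collapse_by_gene(tx)
print(collapsed)
```

What I actually need is the equivalent of this as a GRanges object built from the Arabidopsis TxDb. Separately, the error suggests that the gene IDs in my object may not be Entrez IDs at all, so it might be worth checking `keytypes(org.At.tair.db)` to see which key type they correspond to.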