8.6 years ago
tonja.r
I have found something strange in this dataset: there are exons that map to a given gene_id, but no gene with that gene_id exists in the dataset.
library(TxDb.Mmusculus.UCSC.mm9.knownGene)  # also attaches GenomicFeatures and AnnotationDbi
mm9 = TxDb.Mmusculus.UCSC.mm9.knownGene
The exon dataset contains exons assigned to gene ID 100038977:
exon = exons(mm9)
gene_id_exons = select(mm9, keys=as.character(exon$exon_id), columns = c("GENEID"), keytype = "EXONID")
> gene_id_exons[which(gene_id_exons$GENEID == "100038977"),][1:4,]
EXONID GENEID
243122 241618 100038977
243123 241619 100038977
243124 241620 100038977
243125 241621 100038977
The gene dataset does not contain a gene with that ID:
gene<-genes(mm9)
> which(gene$gene_id == "100038977")
integer(0)
Why are there exons that belong to the 100038977 gene (Gm1993) but there is no such gene listed in the gene dataset?
The same happens with gene_ids 100039550 (Gm10486), 100039890 (Gm15093), 100039939 (Gm2506), 100040048 (Ccl27b), 100040631, etc.
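To list every affected gene ID at once rather than checking them one by one, the two sets of IDs can be compared with setdiff(). This is only a minimal sketch on toy data: the first four rows of the table are taken from the output above, while the remaining rows and the gene_ids vector are hypothetical stand-ins for the real select() result and for gene$gene_id.

```r
# Toy stand-in for the EXONID -> GENEID table returned by select().
# The first four rows come from the post; the last two are hypothetical.
gene_id_exons <- data.frame(
  EXONID = c(241618L, 241619L, 241620L, 241621L, 1L, 2L),
  GENEID = c(rep("100038977", 4), "11111", NA),
  stringsAsFactors = FALSE
)

# Toy stand-in for gene$gene_id (hypothetical values).
gene_ids <- c("11111", "22222")

# Drop exons without a gene assignment, then keep each gene ID once.
exon_gene_ids <- unique(gene_id_exons$GENEID[!is.na(gene_id_exons$GENEID)])

# Gene IDs that are referenced by exons but absent from the gene set.
missing_ids <- setdiff(exon_gene_ids, gene_ids)
print(missing_ids)
```

With the real objects, replacing gene_ids with gene$gene_id should give the full list of gene IDs that have exons but no entry in the gene dataset.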
You might want to post this on the Bioconductor support forum. I expect that the various TxDb packages are constructed with a script; perhaps it has a bug.