clusterprofiler gsea analysis
24 months ago

Hi all, I am using clusterProfiler for GSEA and I am stuck at the OrgDb step. For my RNA-seq and DGE analysis I used a GTF file from GENCODE (GRCh38.p13), so my gene IDs look like "ENSG00000000003.15". The tutorials set the organism like this:

organism = "org.Hs.eg.db"
BiocManager::install(organism, character.only = TRUE)
library(organism, character.only = TRUE)

That is how the organism database is specified there. But I think that package is for hg19 (a different source), and I cannot run GSEA with that organism db. How and where can I find an OrgDb file that comes from GENCODE? Thanks in advance.

clusterprofiler gsea organism • 722 views

Your gene IDs are versioned Ensembl IDs. If you remove the trailing period and version number (.15 in your example), you'll have the regular Ensembl IDs, which should be present in the OrgDb. If you post your code and the current error, we can give more specific advice.
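As a minimal sketch of the suffix removal described above: the ranked statistics below are made-up toy values, and `stats` and `gene_list` are hypothetical names, not anything from the original post. The commented-out `gseGO()` call at the end is an assumption about how the cleaned list would then be used with `keyType = "ENSEMBL"`.

```r
# Toy ranked statistics named by versioned GENCODE gene IDs (values are invented).
stats <- c("ENSG00000000003.15" = 2.1,
           "ENSG00000000419.14" = -0.7,
           "ENSG00000000005.6"  = 1.3)

# Drop the trailing ".<version>" so the names become plain Ensembl gene IDs.
names(stats) <- sub("\\.[0-9]+$", "", names(stats))

# GSEA needs unique names and a vector sorted in decreasing order.
stats <- stats[!duplicated(names(stats))]
gene_list <- sort(stats, decreasing = TRUE)

print(names(gene_list))

# Assumed downstream call (needs clusterProfiler and org.Hs.eg.db installed):
# gse <- clusterProfiler::gseGO(geneList = gene_list,
#                               OrgDb    = org.Hs.eg.db::org.Hs.eg.db,
#                               keyType  = "ENSEMBL",
#                               ont      = "BP")
```

Dropping duplicates with `!duplicated()` keeps the first occurrence only; if several versioned IDs collapse to the same gene, you may prefer a different rule for choosing which statistic to keep.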


I don't think it's hg19, as the package always draws on current information from NCBI/Ensembl (GRCh38). It would help if there were a way to print the sources directly from within the package; the closest I see is "org.Hs.eg_dbInfo". In general, it takes its information from https://ftp.ncbi.nlm.nih.gov/gene/DATA, and if you look at https://ftp.ncbi.nlm.nih.gov/gene/DATA/README_ensembl, the NCBI assembly version is GRCh38.p14 and the Ensembl assembly version is GRCh38.p13.
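For reference, `org.Hs.eg_dbInfo` is a function, so calling it should return the package's metadata table, which typically lists source names, URLs, and dates. A minimal sketch, assuming org.Hs.eg.db is installed (the `requireNamespace()` guard and the `info` variable are only there so the snippet runs either way):

```r
# Print the metadata table bundled with org.Hs.eg.db, if the package is available.
info <- NULL
if (requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  # Returns a data frame of name/value pairs describing the annotation sources.
  info <- org.Hs.eg.db::org.Hs.eg_dbInfo()
  print(info)
} else {
  message("org.Hs.eg.db is not installed; see BiocManager::install('org.Hs.eg.db')")
}
```

The exact rows depend on the installed package version, so check the output on your own system instead of relying on a fixed set of entries.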
