bioMart biotype assignment
9 weeks ago
imaparna27 • 0

Hello,

I am trying to assign biotypes to the genes I obtained from a differential expression analysis. I used biomaRt for this, but it returns annotations for only about 54,000 genes, whereas my expression data contains about 57,000 genes.

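library(biomaRt)
# Connect to Ensembl through the US East mirror (no release is specified here)
ensembl <- useEnsembl("ensembl", dataset = "hsapiens_gene_ensembl", mirror = "useast")
genes <- df_RNA$gene_id
# Keep the first 15 characters of each ID to drop the version suffix
G_list <- getBM(filters = "ensembl_gene_id",
                attributes = c("ensembl_gene_id", "entrezgene_id", "hgnc_symbol", "gene_biotype"),
                values = substr(genes, 1, 15), mart = ensembl)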


What could be causing this, and what alternatives to biomaRt could I use?

biomart • 544 views

Did you use the same Ensembl version for both BioMart and to analyse differential expression?

Yes, GRCh38.p13 for the expression analysis as well as for the biotype assignment.

The genome assembly has been GRCh38.p13 since September 2019, but the gene annotation has been updated several times since then. Which Ensembl version did you use?

version 17

Version 17 was from about 2002

Sorry about the earlier details. I re-checked: I obtained the FPKM data from the TCGA database, which used GENCODE v22 for alignment; I guess that corresponds to GRCh38.p2. I used the same versions for all annotations. Still, I am not sure why some of the genes remain unannotated in my results, even though information for a few of them is present on the Ensembl website.

You'll need to use Ensembl 80 to get GENCODE 22. Just include version=80 in your useEnsembl command.
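
For anyone finding this later, here is a minimal sketch of what that could look like. The helper names strip_version and get_biotypes are made up for illustration and are not part of biomaRt. Two caveats, both to the best of my knowledge: useEnsembl ignores the mirror argument when version is set, so it is left out here, and attribute names can differ between releases (entrezgene_id may be named differently in release 80), so it is worth checking listAttributes(ensembl) before adding more columns.

```r
# Drop everything after the first dot in a versioned gene ID.
strip_version <- function(ids) sub("\\..*$", "", ids)

# Hypothetical helper: query a specific Ensembl archive release.
# Needs the biomaRt package and network access when called.
get_biotypes <- function(gene_ids, release = 80) {
  ensembl <- biomaRt::useEnsembl(biomart = "ensembl",
                                 dataset = "hsapiens_gene_ensembl",
                                 version = release)
  biomaRt::getBM(attributes = c("ensembl_gene_id", "hgnc_symbol", "gene_biotype"),
                 filters = "ensembl_gene_id",
                 values = unique(strip_version(gene_ids)),
                 mart = ensembl)
}

# Usage (left commented out because it contacts the Ensembl archive):
# G_list <- get_biotypes(df_RNA$gene_id, release = 80)
```

Stripping on the first dot instead of taking a fixed 15 characters is just a design choice; it gives the same result for standard 15-character Ensembl gene IDs.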

Thanks Emily_Ensembl, this really helped.

Can you give some examples of the Gene IDs that are not being annotated?
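
A quick way to list them is to compare the IDs you queried with the IDs that came back. The sketch below uses small made-up stand-ins for df_RNA$gene_id and G_list (the third ID is invented so that something is missing); swap in your real objects.

```r
# Made-up stand-ins for df_RNA$gene_id and the data frame returned by getBM().
genes <- c("ENSG00000000003.13", "ENSG00000000005.5", "ENSG00000999999.1")
G_list <- data.frame(
  ensembl_gene_id = c("ENSG00000000003", "ENSG00000000005"),
  gene_biotype = c("protein_coding", "protein_coding"),
  stringsAsFactors = FALSE
)

# IDs that were sent to BioMart, without the version suffix.
query_ids <- unique(sub("\\..*$", "", genes))

# IDs that were queried but did not come back with any annotation.
missing_ids <- setdiff(query_ids, G_list$ensembl_gene_id)
print(head(missing_ids))
```

Looking up a few of the missing IDs on the Ensembl website should show whether they were retired or added in a different release from the one being queried.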