Ensembl BioMart: why are some genes not found?
9.0 years ago
biocyberman ▴ 860

I am trying to retrieve exon coordinates for all genes in Agilent's Clinical Research Exome via BioMart, but some genes are missing from the results. For example, AK6 is not found: Ensembl BioMart seems to mistake AK6 for TAF9, and GeneCards (http://www.genecards.org/cgi-bin/carddisp.pl?gene=AK6) does the same. Entrez, however, says otherwise:

Entrez Gene summary for AK6 Gene:

This gene encodes a protein that belongs to the adenylate kinase family of enzymes. The protein has a nuclear localization and contains Walker A (P-loop) and Walker B motifs and a metal-coordinating residue. The protein may be involved in regulation of Cajal body formation. In human, AK6 and TAF9 (GeneID: 6880) are two distinct genes that share 5' exons. Alternative splicing results in multiple transcript variants. (provided by RefSeq, Sep 2013)
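For anyone wanting to reproduce the lookup, here is a minimal sketch of the kind of BioMart query described above, plus a check for symbols that come back empty. The dataset name `hsapiens_gene_ensembl`, the `hgnc_symbol` filter, and the exon attribute names are what I understand the Ensembl gene mart to use, but treat them as assumptions and confirm them in the BioMart web interface. The helper names `build_exon_query` and `missing_symbols` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Attribute names are assumptions; check them against the BioMart attribute list.
EXON_ATTRIBUTES = (
    "ensembl_gene_id",
    "external_gene_name",
    "chromosome_name",
    "exon_chrom_start",
    "exon_chrom_end",
)


def build_exon_query(symbols, dataset="hsapiens_gene_ensembl"):
    """Build a BioMart XML query asking for exon coordinates by HGNC symbol."""
    query = ET.Element(
        "Query",
        virtualSchemaName="default",
        formatter="TSV",
        header="0",
        uniqueRows="1",
    )
    ds = ET.SubElement(query, "Dataset", name=dataset, interface="default")
    ET.SubElement(ds, "Filter", name="hgnc_symbol", value=",".join(symbols))
    for attr in EXON_ATTRIBUTES:
        ET.SubElement(ds, "Attribute", name=attr)
    return ET.tostring(query, encoding="unicode")


def missing_symbols(requested, tsv_text, symbol_column=1):
    """Return the requested symbols that never appear in the TSV result."""
    found = set()
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) > symbol_column and fields[symbol_column]:
            found.add(fields[symbol_column])
    return [s for s in requested if s not in found]


if __name__ == "__main__":
    # The XML is what gets submitted as the "query" parameter of the
    # martservice endpoint; the actual HTTP request is left out here.
    print(build_exon_query(["AK6", "TAF9"]))
```

Running `missing_symbols` on the returned table against the full exome gene list gives the genes that BioMart did not return under the requested symbol, which is how a case like AK6 shows up.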

ensembl gene genecards biomart • 2.5k views
9.0 years ago
Emily 23k

This looks like something has gone wrong at our end. We're looking into it.

Update: This has come to us via HGNC. We are now taking it up with HGNC.

Update: This was based on some old HGNC data that has since been fixed. It will be fixed for Ensembl release 81 (due in July) - I'm afraid the update missed release 80, due this month.


@Emily_Ensembl: I actually believe otherwise. In Ensembl, TAF9 has two ENSG IDs: ENSG00000085231, ENSG00000273841; while on HGNC, they have one ID for each gene:

AK6: ENSG00000085231; Entrez: 102157402; HGNC:49151; Link: http://www.genenames.org/cgi-bin/gene_symbol_report?hgnc_id=HGNC:49151

TAF9: ENSG00000273841; Entrez: 6880; HGNC:11542; Link: http://www.genenames.org/cgi-bin/gene_symbol_report?hgnc_id=HGNC:11542

So something happened on Ensembl side!

I think it has something to do with mapping interval/coordinates back to gene names.
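To make the discrepancy concrete, here is a small sketch that compares the two symbol-to-ID mappings quoted above and reports where they disagree. The IDs are exactly those listed in this comment; the function name `compare_mappings` and the dictionary layout are only for illustration.

```python
def compare_mappings(ensembl, hgnc):
    """Return {symbol: (IDs only in ensembl, IDs only in hgnc)} where the two differ."""
    diffs = {}
    for symbol in sorted(set(ensembl) | set(hgnc)):
        a = set(ensembl.get(symbol, ()))
        b = set(hgnc.get(symbol, ()))
        if a != b:
            diffs[symbol] = (sorted(a - b), sorted(b - a))
    return diffs


# Symbol -> Ensembl gene IDs, as quoted in this thread.
ensembl_view = {"TAF9": ["ENSG00000085231", "ENSG00000273841"]}
hgnc_view = {"AK6": ["ENSG00000085231"], "TAF9": ["ENSG00000273841"]}

for symbol, (only_ensembl, only_hgnc) in compare_mappings(ensembl_view, hgnc_view).items():
    print(symbol, "Ensembl only:", only_ensembl, "HGNC only:", only_hgnc)
```

The same comparison could be run over a full export from each resource to list every symbol affected, assuming both exports are reduced to this symbol-to-IDs form first.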


What's happened is our links to HGNC come in via RefSeq and their links to RefSeq are wrong, so we've pulled in the wrong HGNCs. As I said, we're on the case.


Got it. I read too fast, sorry.


Thanks for the update, @Emily_Ensembl. I am currently in urgent need of GTF files for the GRCh37 and rat Rn6 releases. Could you point out how I can get or make them without the HGNC data problem, before release 81?

Thanks

