Main difference between the formats of Ensembl and UCSC annotation
1
0
7.9 years ago
bxia ▴ 180

Ensembl uses chr+number, UCSC uses number,

right?

Is there any converter available?

RNA-Seq Ensembl UCSC • 2.0k views
0

I'm very confused. So Ensembl and GENCODE are also different?

I found a link on the Ensembl website that took me to GENCODE, so I thought they were the same.

If they are different, then if I used the Ensembl genome file to build my index, I can't use the GENCODE annotation file to annotate, right?

1

Yes, they are different. But I didn't mean that you cannot use the GENCODE annotation. If both files use the same system to name chromosomes, then you can use them together.
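A quick way to check whether a genome FASTA and an annotation GTF use the same chromosome naming is to compare the sequence names in each. The following is only a minimal sketch: the function names are made up for illustration, and it assumes a standard FASTA (names are the text after `>` up to the first whitespace) and a tab-separated GTF whose first column is the sequence name.

```python
def fasta_names(lines):
    """Collect sequence names from FASTA header lines.

    The name is taken as the text after '>' up to the first whitespace.
    """
    names = set()
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:  # skip a bare '>' with no name
                names.add(parts[0])
    return names


def gtf_names(lines):
    """Collect sequence names (column 1) from GTF lines, skipping comments."""
    names = set()
    for line in lines:
        if line.strip() and not line.startswith("#"):
            names.add(line.split("\t", 1)[0])
    return names


def compare_names(fasta_lines, gtf_lines):
    """Report which sequence names are shared and which appear in one file only."""
    fa = fasta_names(fasta_lines)
    gtf = gtf_names(gtf_lines)
    return {
        "shared": fa & gtf,
        "gtf_only": gtf - fa,
        "fasta_only": fa - gtf,
    }
```

With real files you would pass open file handles, e.g. `compare_names(open("genome.fa"), open("genes.gtf"))` (hypothetical file names). If `shared` is empty or `gtf_only` contains the main chromosomes, the two files most likely use different naming schemes.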

0

No, they are not different. Ensembl = GENCODE.

1
7.9 years ago

Ensembl uses chr+number, UCSC uses number:

No. For example, chrM (UCSC) vs MT (Ensembl), so it is not just a matter of adding or removing a prefix.

see @dpryan79 's https://github.com/dpryan79/ChromosomeMappings :

This repository contains chromosome/contig name mappings between UCSC <-> Ensembl <-> Gencode for a variety of genomes.
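As a rough sketch of how such a mapping could be applied to an annotation file: the code below assumes the mapping is a two-column, tab-separated text file (source name, then target name) in which a contig without an equivalent has an empty second column. Check the actual files in the repository before relying on that. The function names are hypothetical.

```python
def load_mapping(lines):
    """Build a {source_name: target_name} dict from two-column, tab-separated lines.

    Entries with an empty target (no equivalent contig) are skipped.
    """
    mapping = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[0] and fields[1]:
            mapping[fields[0]] = fields[1]
    return mapping


def rename_chromosomes(gtf_lines, mapping):
    """Yield GTF lines with column 1 renamed according to the mapping.

    Comment and blank lines pass through unchanged. Records on contigs that
    have no entry in the mapping are dropped, so the output only contains
    names that exist in the target naming scheme.
    """
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            yield line
            continue
        name, sep, rest = line.partition("\t")
        if name in mapping:
            yield mapping[name] + sep + rest
```

Usage would look something like `out.writelines(rename_chromosomes(open("in.gtf"), load_mapping(open("mapping.txt"))))`, with file names of your own. Dropping unmapped contigs is a design choice in this sketch; you may prefer to keep them or to raise an error instead.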

1

Oh, I hadn't noticed that before. Why do they use different strategies?

4

Why do they use different strategies?

to keep our jobs :-)

0

Because they are different, independent projects that started at different times? Why do we keep seeing p53 when there is an official gene name for it, i.e. TP53?

At any rate, Ensembl uses the notation 1:XXXX-XXXX, whereas UCSC uses chr1:XXXX-XXXX. So it's the other way around from your OP.

Please bear in mind that chromosome notation is not the only difference between UCSC and Ensembl when it comes to gene annotation. Check the latest paper on the Ensembl gene annotation system for more details, but in a nutshell, Ensembl provides the merged gene set that combines the computational annotation by the Ensembl team with the manual annotation by the HAVANA team. This is the GENCODE set. Therefore, GENCODE = merged Ensembl gene set. HAVANA takes into account the Yale set of pseudogenes from Mark Gerstein's group, which gets incorporated by Ensembl.

The GENCODE set seems to be the default choice in UCSC now, but in the not too distant past UCSC Genes were the default instead, AFAIK. Now they seem to be called "Previous Version of UCSC Genes", but you can still see them if you use the GRCh37 (= hg19) browser.

