Question: Comparing coordinates between different Ensembl annotation versions
nitandressa (0), Germany, wrote 5 weeks ago:

Hello all!

I have two experiments done on GRCh38; however, my collaborator and I used different Ensembl gene annotation versions to obtain our coordinates of interest. He used Ensembl version 98 and I used Ensembl version 95. So we have this:

$ grep "ENSG00000237491" Homo_sapiens.GRCh38.98.chr.gtf | grep -P '\bgene\b' (Him)
1   havana  gene    778747  810065  .   +   .   gene_id "ENSG00000237491"; gene_version "10"; gene_name "LINC01409"; gene_source "havana"; gene_biotype "lncRNA";

and this:

$ grep "ENSG00000237491" Homo_sapiens.GRCh38.95.chr.gtf | grep -P '\bgene\b' (Me)
1   havana  gene    778770  810060  .   +   .   gene_id "ENSG00000237491"; gene_version "8"; gene_name "AL669831.5"; gene_source "havana"; gene_biotype "lincRNA";

My question is: how can I compare these regions to find out whether they correspond to the same thing? We are looking for differences in splicing patterns (intron retention, exon skipping, alternative 5' and/or 3' end usage, and so on). Only the Ensembl version differs; the paired-end Illumina read alignment is the same.

Thanks in advance!

Best!

gene annotation ensembl • 130 views
modified 5 weeks ago by RamRS (30k) • written 5 weeks ago by nitandressa (0)
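
As a first check on whether the two gene records describe the same locus, the intervals can be compared directly. The following is a minimal sketch, not an established tool: the function names `parse_interval` and `overlap_stats` are made up for illustration, and the two records are the lines from the grep output above, pasted in with single spaces (real GTF files are tab-separated, which the whitespace split also handles for the first eight columns).

```python
# Gene records copied from the two grep outputs in the question.
V98 = ('1 havana gene 778747 810065 . + . gene_id "ENSG00000237491"; '
       'gene_version "10"; gene_name "LINC01409"; gene_source "havana"; '
       'gene_biotype "lncRNA";')
V95 = ('1 havana gene 778770 810060 . + . gene_id "ENSG00000237491"; '
       'gene_version "8"; gene_name "AL669831.5"; gene_source "havana"; '
       'gene_biotype "lincRNA";')


def parse_interval(gtf_line):
    """Return (chrom, start, end, strand); GTF coordinates are 1-based, inclusive."""
    # Split into at most 9 fields so the attribute column stays in one piece.
    fields = gtf_line.split(None, 8)
    return fields[0], int(fields[3]), int(fields[4]), fields[6]


def overlap_stats(a, b):
    """Compare interval b against interval a (hypothetical helper)."""
    if a[0] != b[0] or a[3] != b[3]:
        # Different chromosome or strand: treat as non-overlapping.
        return {"overlap_bp": 0, "jaccard": 0.0,
                "start_shift": None, "end_shift": None}
    overlap = max(0, min(a[2], b[2]) - max(a[1], b[1]) + 1)
    union = (a[2] - a[1] + 1) + (b[2] - b[1] + 1) - overlap
    return {
        "overlap_bp": overlap,
        "jaccard": overlap / union,
        "start_shift": b[1] - a[1],  # positive: b starts downstream of a
        "end_shift": b[2] - a[2],    # negative: b ends upstream of a
    }


if __name__ == "__main__":
    print(overlap_stats(parse_interval(V98), parse_interval(V95)))
```

For the two records above, this reports a 31,291 bp overlap: the release 95 interval lies entirely inside the release 98 one, starting 23 bp later and ending 5 bp earlier. Since the question is about splicing, the same comparison would probably be more informative on the `exon` and `transcript` lines of each file than on the `gene` line alone.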

I'm no expert in human data, but aren't there conversion tables available between different versions (tables that link IDs from version X to version Y)?

written 5 weeks ago by lieven.sterck (8.5k)

I know of some conversion software, but that is for annotations on different genome versions. I don't know of any table for annotations on the same genome version.

written 5 weeks ago by nitandressa (0)
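
In the absence of a ready-made conversion table, the two records in the question already share the same stable ID (ENSG00000237491), so one option is to use that ID as the key and list which attributes changed between releases. The sketch below assumes the stable ID is a reliable key across releases on the same assembly; `parse_attributes` and `diff_records` are hypothetical helper names, not part of any Ensembl tool.

```python
import re

# Gene records copied from the two grep outputs in the question.
V95 = ('1 havana gene 778770 810060 . + . gene_id "ENSG00000237491"; '
       'gene_version "8"; gene_name "AL669831.5"; gene_source "havana"; '
       'gene_biotype "lincRNA";')
V98 = ('1 havana gene 778747 810065 . + . gene_id "ENSG00000237491"; '
       'gene_version "10"; gene_name "LINC01409"; gene_source "havana"; '
       'gene_biotype "lncRNA";')


def parse_attributes(gtf_line):
    """Parse the 9th GTF column (key "value"; pairs) into a dict."""
    attr_text = gtf_line.split(None, 8)[8]
    return dict(re.findall(r'(\w+) "([^"]*)"', attr_text))


def diff_records(old_line, new_line):
    """Return {attribute: (old, new)} for attributes that differ."""
    old, new = parse_attributes(old_line), parse_attributes(new_line)
    if old.get("gene_id") != new.get("gene_id"):
        raise ValueError("records do not share a stable ID")
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys
            if old.get(k) != new.get(k)}


if __name__ == "__main__":
    for key, (old, new) in diff_records(V95, V98).items():
        print(f"{key}: {old} -> {new}")
```

On the two lines from the question, the differing attributes are `gene_version` (8 vs 10), `gene_name` (AL669831.5 vs LINC01409) and `gene_biotype` (lincRNA vs lncRNA), while `gene_id` and `gene_source` are unchanged.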

I get no hits when I search for that ENSG ID on the Jan 2019 Ensembl archive site (Ensembl release 95). On the Sept 2019 archive site (Ensembl release 98), the location is actually different from what you have above.

MSL3 (Human Gene)
ENSG00000005302 X:11758159-11775772:1
MSL complex subunit 3 [Source:HGNC Symbol;Acc:HGNC:7370]
modified 5 weeks ago • written 5 weeks ago by genomax (89k)

Sorry! I gave a bad example. I have updated the post with a better one.

written 5 weeks ago by nitandressa (0)

Even with ENSG00000237491, I get no hits from the Jan 2019 archive site, but I do from the Sept 2019 archive site.

I think the difference may be because these genes are from havana annotation. They may have deprecated the older annotation after a newer version came out.

While Ensembl helpdesk staff stop by Biostars periodically, you may want to open a ticket with their helpdesk to get the official explanation. Post it here as an answer when you do.

modified 5 weeks ago • written 5 weeks ago by genomax (89k)

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.

written 5 weeks ago by RamRS (30k)