Difference between ENSEMBL releases
15 months ago
gernophil ▴ 80

Hey everyone,

I know there are already a lot of threads about this, but I still don't get it. I used ENSEMBL Release 98 for my analysis (variant calling) and I am now wondering what difference it would have made if I had used a newer one.

If I understand this correctly, the underlying genome sequence (GRCh38.p13) is still the same and the different releases just have different annotations. Also, the ENSEMBL Release number is not specific to the human genome, but applies to the whole ENSEMBL database.

Q0: Am I correct up to this point? Q1: So what is the difference between, say, GRCh38.p13 and GRCh38.p12?

I used BWA MEM to map reads to the reference genome.

Q2: If I use different ENSEMBL releases here, would the mapping differ? To my understanding this just depends on the sequence and not the annotation.

For a different analysis I filtered some genes from these BAM files with a BED file that I created from GENCODE v42. Was that OK, or should I have used GENCODE v32 (or should GENCODE v42 still work regardless of the ENSEMBL Release)?

Q3: So, can someone tell me the difference between the different releases? I still find the information available online a bit confusing.

Thanks for your help and patience :).


Thanks for these answers. Just to clarify: Release 98 uses GRCh38.p13 and so does the recent Release 108. However, there still might be slight differences in the underlying assembly?


If they both use GRCh38.p13, they have the same assembly (GRCh38.p13 = GRCh38.p13), just the annotations of this assembly would be different.
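If you want to confirm this for the FASTA files you actually downloaded, one option is to compare per-sequence checksums. The following is only a minimal sketch: the function names `fasta_md5s` and `compare_references` are made up for illustration, and it assumes plain FASTA text with sequences uppercased before hashing, so soft-masked and unmasked copies of the same sequence compare as equal.

```python
import hashlib
from typing import Dict, Iterable, List


def fasta_md5s(lines: Iterable[str]) -> Dict[str, str]:
    """Return {sequence name: MD5 of the uppercased sequence} for FASTA lines."""
    digests: Dict[str, str] = {}
    name = None
    md5 = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Close the previous record before starting a new one.
            if name is not None:
                digests[name] = md5.hexdigest()
            name = line[1:].split()[0]  # keep only the ID, drop the description
            md5 = hashlib.md5()
        elif name is not None:
            md5.update(line.upper().encode("ascii"))
    if name is not None:
        digests[name] = md5.hexdigest()
    return digests


def compare_references(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """Return names whose sequence differs or that exist in only one reference."""
    return sorted(n for n in set(a) | set(b) if a.get(n) != b.get(n))


# Hypothetical usage (file names are placeholders, not real paths):
#   import gzip
#   with gzip.open("release_98.fa.gz", "rt") as f98, \
#        gzip.open("release_108.fa.gz", "rt") as f108:
#       print(compare_references(fasta_md5s(f98), fasta_md5s(f108)))
```

An empty list would mean every sequence has the same name and content in both files. Hard-masked files (repeats replaced by N) will differ by design, so compare like with like.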

15 months ago

Q0: I wouldn't rely on different Ensembl versions always using the same underlying assembly. The assembly occasionally changes between releases. The Ensembl release version is indeed associated with the whole Ensembl database.
Q1: GRCh38.p13 and GRCh38.p12 are different patches to the human genome assembly version GRCh38. Patch releases add fix and novel patch scaffolds alongside the primary assembly and leave the chromosome coordinates unchanged.
Q2: It depends on what you map to what. If you map sequences from your experiments to a genome assembly, then this doesn't depend on Ensembl. However, if you then want to retrieve annotations for some regions, you need to make sure that the Ensembl annotations were made on the same assembly you used for mapping. I believe that GENCODE annotations are actually linked to a specific Ensembl version, so if you use GENCODE v42, you should be using the Ensembl version associated with it.
Q3: The differences between Ensembl releases are summarised on their blog under release announcements. Each release corresponds to updated annotations. Sometimes changes are minimal, sometimes a bit more extensive; it depends on the genome.
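For the BED question in particular, a quick sanity check is to compare the contig names and coordinates in the BED file against the reference you mapped to. GENCODE files use UCSC-style names (`chr1`) while Ensembl FASTA files use bare names (`1`), so a BED derived from GENCODE may not match an Ensembl-based BAM without renaming. Here is a minimal sketch, assuming a samtools-style `.fai` index (name and length in the first two tab-separated columns); the function names are made up for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def read_contig_lengths(fai_lines: Iterable[str]) -> Dict[str, int]:
    """Parse contig names and lengths from the lines of a .fai index."""
    lengths: Dict[str, int] = {}
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            lengths[fields[0]] = int(fields[1])
    return lengths


def check_bed(bed_lines: Iterable[str],
              lengths: Dict[str, int]) -> List[Tuple[int, str]]:
    """Return (line number, message) for BED records that do not fit the reference."""
    problems: List[Tuple[int, str]] = []
    for number, line in enumerate(bed_lines, start=1):
        # Skip blank lines and the optional BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, _start, end = line.rstrip("\n").split("\t")[:3]
        if chrom not in lengths:
            problems.append((number, f"unknown contig {chrom!r}"))
        elif int(end) > lengths[chrom]:
            problems.append(
                (number, f"end {end} exceeds length of {chrom!r} ({lengths[chrom]})")
            )
    return problems
```

This only catches naming and out-of-range problems. It cannot tell you whether the gene models themselves changed between annotation releases, which is a separate question from whether the coordinates are on the same assembly.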

The way to work with this is to pick a reference for your project and stick with it. If the project spans several years, you may eventually want to run your whole analysis once again using a recent version at the end. The key is to not mix and match references during your project or you'll end up in a lot of trouble reconciling differences.


Just to add to Jean-Karim Heriche's excellent answer, the following table in the Ensembl documentation will help you to match up the genome assembly presented in each Ensembl release: https://www.ensembl.org/info/website/archives/assembly.html
