Question: Ensembl: Which Human Genome Assembly And Annotation?
5.2 years ago by Gregor Rot (430), Zurich, Switzerland, wrote:

I am a little confused about which human genome assembly and annotation to use.

First issue: I would like to use the Ensembl assembly and annotation. My doubt is this: with the latest release (v74) I tend to use the "primary assembly", which doesn't contain patches. I do that because, to my knowledge, fix patches provide duplicate chromosomal sequence, so when mapping my reads I would get multiple hits in the patched regions (the patches are provided in separate N-padded files). Maybe I am wrong here?

Second issue: when I download the annotation from BioMart (v74), I get genes on patched chromosomes (an example is SLC25A26, ENSG00000261657). I don't want to lose these genes, but there are no patched chromosomes in the primary assembly.
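To see how many genes in an export are affected by the second issue, the rows can be split by chromosome name. This is only a minimal sketch: the column headers `Ensembl Gene ID` and `Chromosome Name` are assumptions about the export and may differ, the function name is hypothetical, and the sample rows are made-up placeholders rather than real records. Anything not named 1-22, X, Y or MT ends up in the second list, so unplaced scaffolds would land there along with patches.

```python
import csv
import io

# Names of the assembled chromosomes as Ensembl writes them (no "chr" prefix).
PRIMARY_CHROMOSOMES = {str(n) for n in range(1, 23)} | {"X", "Y", "MT"}


def split_by_chromosome(tsv_text, chrom_column="Chromosome Name"):
    """Split rows of a tab-separated export into (on_chromosomes, elsewhere).

    chrom_column is an assumed header name; pass the one your export uses.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    on_chromosomes, elsewhere = [], []
    for row in reader:
        if row[chrom_column] in PRIMARY_CHROMOSOMES:
            on_chromosomes.append(row)
        else:
            elsewhere.append(row)
    return on_chromosomes, elsewhere


# Placeholder rows for illustration only; IDs and scaffold name are not real.
EXAMPLE_TSV = (
    "Ensembl Gene ID\tChromosome Name\n"
    "EXAMPLE_GENE_1\t3\n"
    "EXAMPLE_GENE_2\tEXAMPLE_PATCH\n"
    "EXAMPLE_GENE_3\tX\n"
)

kept, other = split_by_chromosome(EXAMPLE_TSV)
print(len(kept), "rows on chromosomes 1-22, X, Y, MT")
for row in other:
    print("not on an assembled chromosome:",
          row["Ensembl Gene ID"], row["Chromosome Name"])
```

Listing the second group makes it easy to check by hand which genes would be lost when mapping against the primary assembly only.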

Is there a way to get the latest patched human genome assembly sequence (with patches already applied to the genomic sequence, and not in separate files)? Thanks

ensembl human assembly • 2.8k views
5.2 years ago by Emily_Ensembl (17k) wrote:

What you're looking for is GRCh38. This is an updated version of the human genome, where all of the fix and novel patches are integrated into the genome, replacing the old primary sequence, along with other updates.

If you just want the sequence, you can download it from the Genome Reference Consortium now. If you want our full annotation, you'll have to wait: because of the work involved in annotating the genome, we expect it to be ready in the summer at the earliest. You can read more about our work with GRCh38 in our blog series; more articles are to come.

