Ensembl: Which Human Genome Assembly And Annotation?
10.3 years ago
Gregor Rot ▴ 540

I am a little confused about which human genome assembly and annotation to use.

First issue: I would like to use the Ensembl assembly and annotation. If I use the latest release (v74), I tend to use the "primary assembly", which doesn't contain patches (ftp://ftp.ensembl.org/pub/release-74/fasta/homo_sapiens/dna/). I do that because, to my knowledge, fix patches provide duplicate chromosomal sequence, so when mapping my reads I would get multiple hits in these patched regions (since the patches are provided in separate N-padded files). Maybe I am wrong here?
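One way to check what a downloaded FASTA file actually contains before mapping is to list its header names and separate the primary chromosomes from everything else. The following is only a minimal sketch: it assumes the sequence name is the first whitespace-delimited token of each header and that primary chromosomes are named 1-22, X, Y and MT. The function name `split_fasta_names` and the name `EXAMPLE_PATCH` are made up for illustration.

```python
import io

# Assumption: primary chromosomes are named 1-22, X, Y, MT (no "chr" prefix).
PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y", "MT"}


def split_fasta_names(handle):
    """Return (primary, other) lists of sequence names read from FASTA headers."""
    primary, other = [], []
    for line in handle:
        if not line.startswith(">"):
            continue  # skip sequence lines, only headers matter here
        fields = line[1:].split()
        if not fields:
            continue  # ignore an empty header line
        name = fields[0]
        (primary if name in PRIMARY else other).append(name)
    return primary, other


# Tiny in-memory example; EXAMPLE_PATCH is a hypothetical non-primary name.
example = io.StringIO(
    ">1 dna:chromosome\nACGT\n"
    ">EXAMPLE_PATCH dna:supercontig\nNNNNACGT\n"
    ">MT dna:chromosome\nACGT\n"
)
primary, other = split_fasta_names(example)
print("primary:", primary)
print("other:", other)
```

If `other` comes back empty for the file you plan to index, the reference holds only primary chromosomes and patch-induced multi-mapping should not come from the reference itself.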

Second issue: when I download the annotation via BioMart (v74), I get genes on patched chromosomes (an example is SLC25A26: http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000261657). I don't want to lose these genes, but the primary assembly contains no patched chromosomes to place them on.
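To see how many genes are affected, a BioMart export can be scanned for genes whose chromosome name is not a primary chromosome. This is a rough sketch, not an Ensembl-provided method: it assumes a tab-separated export with columns called `Ensembl Gene ID` and `Chromosome Name` (adjust to whatever headers your export has), and the rows below are invented placeholders apart from the gene ID quoted above.

```python
import csv
import io

# Assumption: primary chromosomes are named 1-22, X, Y, MT (no "chr" prefix).
PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y", "MT"}


def genes_off_primary(handle, id_col="Ensembl Gene ID", chrom_col="Chromosome Name"):
    """Return gene IDs whose chromosome name is not a primary chromosome."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row[id_col] for row in reader if row[chrom_col] not in PRIMARY]


# In-memory stand-in for an export; chromosome names here are placeholders.
export = io.StringIO(
    "Ensembl Gene ID\tChromosome Name\n"
    "HYPOTHETICAL_GENE_A\t3\n"
    "ENSG00000261657\tEXAMPLE_PATCH\n"
)
off_primary = genes_off_primary(export)
print(off_primary)
```

The resulting list is the set of genes that would have no matching sequence if reads are mapped against the primary assembly only.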

Is there a way to get the latest patched human genome assembly sequence (with patches already applied to the genomic sequence, and not in separate files)? Thanks

ensembl human assembly • 4.4k views
10.3 years ago
Emily 23k

What you're looking for is GRCh38. This is an updated version of the human genome, where all of the fix and novel patches are integrated into the genome, replacing the old primary sequence, along with other updates.

If you just want the sequence, you can download it from the Genome Reference Consortium now. If you're looking for our full annotation, you'll have to wait: because of the work involved in annotating the genome, we expect it to be ready in the summer at the earliest. You can read more about our work with GRCh38 in our blog series, with more articles to come.
