Question: How Many Human Genome Assemblies Are Available?
8.5 years ago by
Theodosius Dobzhansky Center for Genome Bioinformatics
Alex 1.4k wrote:

How many human genome assemblies are available for analysis? On the NCBI website I found three available genomes assembled into chromosomes:

  1. the reference assembly
  2. the Celera assembly
  3. and Venter's diploid genome.

Additionally, there are three WGS assemblies that are not assembled into chromosomes:

  1. Watson's genome
  2. African genome
  3. Asian genome

Are there any other available assemblies that are not listed by NCBI?

modified 8.4 years ago by lh3 31k • written 8.5 years ago by Alex 1.4k
8.5 years ago by
Jorge Amigo 11k
Santiago de Compostela, Spain
Jorge Amigo 11k wrote:

In case you mean "browseable" assemblies, then yes: as far as I know, these are all the publicly available ones to date.

But if you want human genome assemblies for deeper analysis, doesn't the 1000 Genomes data suit your needs? You could also consider digging into the major NGS repositories, such as the American SRA or the European ENA.

8.5 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum 120k wrote:

See the description of the "Genome Variants" track in the UCSC Genome Browser:

This track displays variant base calls from the publicly released genome sequences of several individuals:

* 5 Sub-Saharan African genomes sequenced by Penn State University:
      o !Gubi (KB1),
      o G/aq'o (NB1),
      o !Ai (MD8),
      o D#kgao (TK1),
      o Archbishop Desmond Tutu (ABT), 
* 6 individuals from the 1000 Genomes Project high-coverage pilot:
      o a CEU daughter and parents (NA12878, NA12891, NA12892)
      o a YRI daughter and parents (NA19240, NA19238, NA19239) 
* and independently published genomes:
      o Craig Venter,
      o James Watson,
      o Anonymous Yoruba individual NA18507,
      o Anonymous Han Chinese individual (YH, YanHuang Project),
      o Seong-Jin Kim (SJK),
      o Anonymous Korean individual (AK1),
      o Stephen Quake,
      o Anonymous Irish male,
      o Extinct Palaeo-Eskimo Saqqaq individual
8.5 years ago by
United States
lh3 31k wrote:

I do not know how one would define "assembly". But in the sense of de novo assembly, 5 are publicly available:

  • The official human reference genome
  • Celera assembly
  • Venter
  • YanHuang
  • NA18507

In the sense of mapping assemblies, there are very few. For all the sequencing projects in the public domain, you can always get the raw reads, sometimes the list of SNPs, and occasionally the alignments, but these are not really mapping assemblies. By my definition, a mapping assembly has to tell you which regions are accessible and which are not, and this is rarely available.
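
To make "accessible" concrete, here is a minimal sketch of one possible way to derive such regions. It assumes you already have a per-base read depth for a contig and that a position counts as accessible when its depth falls inside a chosen window. The function name `accessible_regions`, the depth-window criterion, and the thresholds are illustrative assumptions, not how any of the projects above define accessibility.

```python
from typing import List, Tuple


def accessible_regions(
    depth: List[int], min_depth: int, max_depth: int
) -> List[Tuple[int, int]]:
    """Return half-open [start, end) intervals where
    min_depth <= depth <= max_depth (hypothetical criterion)."""
    regions: List[Tuple[int, int]] = []
    start = None  # start of the interval currently being extended, if any
    for pos, d in enumerate(depth):
        ok = min_depth <= d <= max_depth
        if ok and start is None:
            start = pos                    # open a new accessible interval
        elif not ok and start is not None:
            regions.append((start, pos))   # close the interval before pos
            start = None
    if start is not None:                  # interval runs to the contig end
        regions.append((start, len(depth)))
    return regions


# Toy per-base depths: zero coverage at the ends, a high-depth
# (possibly repetitive) stretch in the middle.
toy_depth = [0, 0, 12, 15, 14, 90, 88, 13, 11, 0]
print(accessible_regions(toy_depth, min_depth=10, max_depth=60))
# prints [(2, 5), (7, 9)]
```

Everything outside the returned intervals would be treated as "unknown" rather than "matches the reference", which is the distinction a plain SNP list cannot express.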

I have processed some of the published data sets in a uniform way. For people who are interested, they are here.
