Question: Difference between various GRCh37 releases?
5.1 years ago by
Japan
morteza.mahmoudisaber70 wrote:

I have some genomic coordinates based on hg19, which is equivalent to GRCh37, and I would like to analyze these coordinates using the actual genome sequences. In Ensembl there are several releases of GRCh37, such as GRCh37.70 and GRCh37.75. What is the actual difference between all these releases of GRCh37? Are the coordinates the same for all of them?

genome • 6.0k views
ADD COMMENTlink modified 5.1 years ago by Gjain5.4k • written 5.1 years ago by morteza.mahmoudisaber70

Duplicate of: What's The Difference Between Two Versions Of The Same Assembly?

ADD REPLYlink written 5.1 years ago by Pierre Lindenbaum125k
5.1 years ago by
Ying W4.0k
South San Francisco, CA
Ying W4.0k wrote:

This is the website of the Genome Reference Consortium; it might be informative for you to poke around there.

hg19 is typically thought of as a subset of GRCh37, with GRCh37.p# denoting the different minor (patch) releases; these changes do not affect coordinates.

Ensembl uses a different versioning system. You can find their list of changes here: http://www.ensembl.org/Help/ArchiveList though it seems like the latest release is mostly schema changes with some changes to annotations.

ADD COMMENTlink written 5.1 years ago by Ying W4.0k

Just to clarify further, versioning in Ensembl only reflects the annotation of the genes themselves. The primary assembly (i.e. the coordinate system) is the same for anything that's GRCh37. A .p# suffix represents patches on top of the genome: things are added, but the existing primary assembly is not altered in any way.
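To make the two kinds of suffix concrete, here is a minimal Python sketch that splits the labels mentioned in this thread into an assembly name and a suffix. The helper `describe_label` and its regular expression are hypothetical and only cover labels shaped like "GRCh37.75" or "GRCh37.p13"; they are not an official naming parser.

```python
import re

# Hypothetical pattern: assembly name, a dot, an optional "p", then a number.
LABEL = re.compile(r"^(GRCh\d+)\.(p?)(\d+)$")

def describe_label(label):
    """Split a label into (assembly, kind of suffix, number)."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognised label: {label!r}")
    assembly, is_patch, number = match.groups()
    # A bare number is read as an Ensembl release, "p<number>" as a GRC patch.
    kind = "GRC patch release" if is_patch else "Ensembl release"
    return assembly, kind, int(number)

labels = ["GRCh37.70", "GRCh37.75", "GRCh37.p13"]
described = [describe_label(label) for label in labels]
assemblies = {assembly for assembly, _, _ in described}

for label, (assembly, kind, number) in zip(labels, described):
    print(f"{label}: assembly={assembly}, {kind} {number}")

# All three labels share one assembly, hence one primary coordinate system.
print("same primary assembly:", len(assemblies) == 1)
```

Under this reading, GRCh37.70 and GRCh37.75 differ only in the Ensembl release number, so coordinates on the primary assembly carry over between them.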

ADD REPLYlink written 5.1 years ago by Emily_Ensembl20k

Hi Emily, a question regarding your last sentence: how can the primary assembly coordinates remain unaltered if you ADD something? Let's say that in the middle of chromosome 17 there is a sequence which contains 4 repeats and not 3, as previously thought. I publish a patch, and now there are 4 repeats instead of only 3: I have added one more repeat. Surely that changes every coordinate after it?

ADD REPLYlink written 14 months ago by Marvin150

The three repeats are still there in the primary assembly. There's just a different version, with four repeats, that you can (and should) use instead for that locus.

ADD REPLYlink written 14 months ago by Emily_Ensembl20k

I don't understand this. Let's say in p6 there are 3 repeats on chromosome 17, and in p7 there are 4. In the FASTA file the sequence ">chr17" will then be one repeat longer, so all the nucleotides after that additional repeat will have a changed coordinate? You say that the three repeats are "still there". Of course they are still there, but there's a fourth one now, right? So the sequence becomes longer? What am I missing?

ADD REPLYlink written 14 months ago by Marvin150

Hello Marvin ,

Have a look at my tutorial "Which human reference genome should I use?". I tried to explain there what patches are. In short: if there is a fix/update for an existing sequence, it is not incorporated into the original sequence. It gets its own name/accession number and is added to the collection of sequences that make up the reference genome.
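As a toy illustration of that idea (not how any real tool stores a genome), the Python sketch below treats the reference as a dictionary of named sequences. The function `add_patch`, the name "chr17_fix_patch" and the short sequences are all made up for the example; real patches carry their own accession numbers.

```python
def add_patch(genome, patch_name, patch_sequence):
    """Return a new collection with the patch added as its own sequence."""
    if patch_name in genome:
        raise ValueError(f"{patch_name} is already present")
    updated = dict(genome)  # copy, so the earlier release stays untouched
    updated[patch_name] = patch_sequence
    return updated

repeat = "ACGT"
# Earlier release: chr17 carries three copies of the repeat.
before = {"chr17": "TT" + repeat * 3 + "GG"}
# Later release: the four-repeat version is added under a separate name.
after = add_patch(before, "chr17_fix_patch", "TT" + repeat * 4 + "GG")

# chr17 itself is unchanged, so every coordinate on it is still valid.
print(after["chr17"] == before["chr17"])
print(len(after["chr17"]), len(after["chr17_fix_patch"]))
```

The primary chr17 record keeps its length and content; only the number of sequences in the collection grows.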

fin swimmer

ADD REPLYlink written 13 months ago by finswimmer13k