Difference between human reference builds GRCh38.p10 and GRCh38.p13
3.8 years ago
bioinfo_ga ▴ 70

Hi, for human analysis, the reference genome (GRCh38) has been updated from patch p10 to p13. I observed a drastic change in genome size: p10 was 3 GB, whereas p13 is 60 GB. What is the reason?

Thanks.

RNA-Seq rna-seq • 1.1k views

i observed a drastic change in terms of the genome size as p.10 was 3 GB whereas p13 is 60 GB,

That is because you are mixing up "primary" assembly with one that contains "alternate haplotypes".

--------
TOPLEVEL
--------
These files contain all sequence regions flagged as toplevel in an Ensembl schema. This includes chromosomes, regions not assembled into chromosomes, and N-padded haplotype/patch regions.

----------------
PRIMARY ASSEMBLY
----------------
Primary assembly contains all toplevel sequence regions excluding haplotypes and patches. This file is best used for performing sequence similarity searches where patch and haplotype sequences would confuse analysis.
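
Since the toplevel file includes N-padded haplotype/patch regions, one way to check where the extra size comes from is to count total bases and N bases per sequence in each FASTA file. Below is a minimal sketch in plain Python; the function names `fasta_base_counts` and `summarize` are made up for illustration and are not part of any Ensembl or NCBI tool.

```python
import gzip
from typing import Dict, Iterable, Tuple


def fasta_base_counts(lines: Iterable[str]) -> Dict[str, Tuple[int, int]]:
    """Return {sequence name: (total bases, N bases)} for FASTA-formatted lines."""
    counts: Dict[str, Tuple[int, int]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The sequence name is the first word of the header line.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            counts[name] = (0, 0)
        elif name is not None:
            total, n_bases = counts[name]
            upper = line.upper()
            counts[name] = (total + len(upper), n_bases + upper.count("N"))
    return counts


def summarize(path: str) -> Tuple[int, int]:
    """Return (total bases, N bases) for a plain or gzip-compressed FASTA file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        counts = fasta_base_counts(handle)
    total = sum(t for t, _ in counts.values())
    n_bases = sum(n for _, n in counts.values())
    return total, n_bases
```

Calling `summarize()` on both the toplevel and the primary assembly files and comparing the N counts should show how much of the difference is padding. Expect this to take a while on a full human genome.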


Oh, I see. OK, let me look at both files. Thanks for the help.


Where do you want to get your human reference genome from? On NCBI, the compressed GRCh38.p10.fna is 911 MB and the compressed GRCh38.p13.fna is 920 MB.

The differences from the previous patch release are listed in README_patch_release.txt:

ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.28_GRCh38.p13/


The Homo_sapiens.GRCh38.dna.toplevel.fa.gz for GRCh38.p13 is 1.0 GB.

Also, the assembly used in Ensembl releases comes from the Genome Reference Consortium.
