Why Are There Ambiguous (N) Bases (Gaps) In The Human Genome?
11.1 years ago
siyu ▴ 150

Can someone explain why there are a few gaps in the reference genome?

Thanks!

reference genome • 9.2k views

You need to be a lot more specific. What do you mean by gaps? Long sequences of Ns? Gaps when you align your gene to the reference? Which genome are you looking at?


Long sequences of Ns. Thanks!

11.1 years ago
Ian 6.0k

The simple answer is that certain stretches of a genome are difficult to sequence, mainly because of repetitive regions, tracts of the same base (homopolymer runs), extreme GC composition, closed DNA, etc. Searching Google for "gaps genome" brings up a whole host of references about the causes of gaps and the attempts to close them.
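If you want to see where the long runs of N actually sit in a sequence you have already loaded, something like the following might help. It is only a minimal sketch: the function name `find_n_runs` and the toy sequence are made up for illustration, and it assumes the sequence fits in memory as a plain Python string.

```python
import re


def find_n_runs(sequence, min_length=1):
    """Return (start, end, length) for each run of N or n in the sequence.

    Coordinates are 0-based and half-open, like Python slices.
    """
    runs = []
    for match in re.finditer(r"[Nn]+", sequence):
        length = match.end() - match.start()
        if length >= min_length:
            runs.append((match.start(), match.end(), length))
    return runs


# Toy sequence, invented for illustration only (not real genome data).
toy = "ACGT" + "N" * 5 + "GGCC" + "n" * 3 + "TT"
print(find_n_runs(toy))                # every run of N
print(find_n_runs(toy, min_length=4))  # only runs of at least 4 bases
```

For a whole chromosome you would read the FASTA record first and pass its sequence string in; raising `min_length` filters out isolated ambiguous bases so that only the real assembly gaps remain.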

11.1 years ago

User deanna.church gave this answer to a similar question about the mouse genome on Biostar:

The Genome Reference Consortium (http://genomereference.org) attempts to model biological gaps in the assemblies that we produce. Unfortunately, in the current assemblies, the models for both centromeres and telomeres are rather poor, so they just consist of a run of Ns. We don't have good estimates of mouse telomere/centromere size, so we use a default of 3M Ns for these regions. This information is marked up in the AGP files that define the assembly:

mouse: ftp://ftp.ncbi.nlm.nih.gov/genbank/genomes/Eukaryotes/vertebrates_mammals/Mus_musculus/GRCm38.p1/
human: ftp://ftp.ncbi.nlm.nih.gov/genbank/genomes/Eukaryotes/vertebrates_mammals/Homo_sapiens/GRCh37.p11/

Note: even within the euchromatic regions there can be long runs of Ns representing gaps that we can't fill yet. In many cases we do have a good size estimate for the gap, typically based on experimental evidence like comparison to an optical map. For human, the problem is that some of the euchromatic gaps are polymorphic, so the size of the gap really depends on the individual you are assessing.
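The gap annotation in those AGP files can be pulled out with a few lines of Python. This is a rough sketch, assuming the tab-delimited column layout of the NCBI AGP specification as I understand it: column 5 is the component type, with `N` (known size) or `U` (unknown size) marking a gap line, column 6 is the gap length and column 7 the gap type. The function name `parse_agp_gaps` and the example lines are hypothetical and are not taken from a real assembly.

```python
def parse_agp_gaps(lines):
    """Collect gap records from an iterable of AGP lines.

    Assumes the standard 9-column, tab-delimited AGP layout; lines with
    fewer columns are skipped rather than raising an error.
    """
    gaps = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        if fields[4] in ("N", "U"):  # gap of known (N) or unknown (U) size
            gaps.append({
                "object": fields[0],
                "start": int(fields[1]),  # 1-based, inclusive
                "end": int(fields[2]),
                "length": int(fields[5]),
                "gap_type": fields[6],
            })
    return gaps


# Hypothetical AGP lines, made up for illustration only.
example = [
    "##agp-version\t2.0",
    "chrToy\t1\t3000000\t1\tN\t3000000\tcentromere\tno\tna",
    "chrToy\t3000001\t3000500\t2\tF\tTOY0001.1\t1\t500\t+",
    "chrToy\t3000501\t3050500\t3\tN\t50000\tcontig\tno\tna",
]
for gap in parse_agp_gaps(example):
    print(gap["object"], gap["start"], gap["end"], gap["length"], gap["gap_type"])
```

With a real file you would pass the open file handle instead of the list. Check the version stated in the file header first, since the set of gap type names differs between AGP versions.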

Post: No reads ever map to first 3Million bases of chromosomes in mouse Genome! Why?


Thanks to whoever edited the username and the link to the post.
