Question: How is the reference genome top level constructed?
1
17 months ago, marongiu.luigi (Germany, Mannheim, UMM) wrote:

Dear all,

I was wondering how the human reference genome top-level FASTA file is built. I thought it was a single-sequence FASTA file, but I realized it is actually a multi-FASTA. It contains not only the sequences of all the chromosomes (which are also available as single-sequence FASTA files) but also several patches and scaffolds. What is the function of these 'extra' sequences? Why are they not included directly in the chromosome files?

Thank you

assembly genome • 453 views
1

Typically not all sequences can be assigned to a chromosome. These extra sequences are put into additional files. I guess these are the ones you're referring to here.

written 17 months ago by Jean-Karim Heriche (20k)
1

There are also alternate contigs, for which the chromosomal location is known but there is enough heterogeneity within the population at that location that alternate sequences were deemed necessary.

written 17 months ago by d-cameron (2.1k)
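
To see which kinds of records a top-level multi-FASTA actually contains (chromosomes, unplaced scaffolds, alternate sequences), one option is to tally its header lines. The following is only a minimal sketch. It assumes Ensembl-style headers of the form `>name dna:TYPE location FLAG`, where the last field marks reference (`REF`) versus alternate (`HAP`) sequence; check the headers of your own file before relying on that layout. `summarize_fasta_headers` is a hypothetical helper name, and the example records are made up, with shortened sequences and coordinates.

```python
from collections import Counter
import io


def summarize_fasta_headers(handle):
    """Count FASTA records by (sequence type, flag) taken from the header line.

    Assumes Ensembl-style headers: ">name dna:TYPE location FLAG".
    Headers that do not match this layout are counted as ("unknown", "").
    """
    counts = Counter()
    for line in handle:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        fields = line[1:].split()
        seq_type, flag = "unknown", ""
        if len(fields) > 1 and ":" in fields[1]:
            # "dna:chromosome" -> "chromosome", "dna:scaffold" -> "scaffold"
            seq_type = fields[1].split(":", 1)[1]
            if len(fields) > 2:
                flag = fields[-1]  # e.g. REF or HAP (assumed layout)
        counts[(seq_type, flag)] += 1
    return counts


# Illustrative records only: real sequences and coordinates are far longer.
example = io.StringIO(
    ">1 dna:chromosome chromosome:GRCh38:1:1:12:1 REF\n"
    "ACGTACGTACGT\n"
    ">KI270728.1 dna:scaffold scaffold:GRCh38:KI270728.1:1:8:1 REF\n"
    "ACGTACGT\n"
    ">CHR_HSCHR1_1_CTG3 dna:chromosome chromosome:GRCh38:CHR_HSCHR1_1_CTG3:1:8:1 HAP\n"
    "TTGGCCAA\n"
)

counts = summarize_fasta_headers(example)
for (seq_type, flag), n in sorted(counts.items()):
    print(seq_type, flag, n)
```

For a real file, pass an open file handle instead of the `io.StringIO` example (for a gzipped download, `gzip.open(path, "rt")` from the standard library).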
3
17 months ago, Emily_Ensembl (EMBL-EBI) wrote:

Dan will explain it for you.
