Counting 705 fasta header characters ('>') in human genome
6 months ago
10mz1 ▴ 10

Does this mean that the (haploid?) assembly consists of 705 large molecules? I expected the count to equal the number of chromosomes plus the mitochondrial genome.

human-genome genetics • 631 views

I am not sure where you got the genome from, but looking at the headers should give you an idea of what the extra entries (705 minus the normal chromosomes and MT) are. Most should be unplaced contigs (marked chrUn) or unlocalized contigs, which are assigned to a chromosome but not to a specific position on it (e.g. >chr22_KI270738v1_random).
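
To make the header check concrete, here is a minimal Python sketch that tallies FASTA headers by rough category. It assumes UCSC-style sequence names (chr1, chrUn_..., ..._random, ..._alt, ..._fix); the function names and the category labels are my own, and a file with NCBI/GenBank-style accessions in the headers would need different patterns.

```python
import re
from collections import Counter


def classify_header(name):
    """Assign a sequence name to a rough category (UCSC-style naming assumed)."""
    if re.fullmatch(r"chr([0-9]{1,2}|X|Y|M|MT)", name):
        return "primary"      # assembled chromosomes and the mitochondrial genome
    if name.startswith("chrUn_"):
        return "unplaced"     # chromosome of origin unknown
    if name.endswith("_random"):
        return "unlocalized"  # chromosome known, position on it unknown
    if name.endswith("_alt"):
        return "alt"          # alternate locus scaffold
    if name.endswith("_fix"):
        return "fix"          # fix patch scaffold
    return "other"


def count_categories(lines):
    """Count header lines (those starting with '>') per category."""
    counts = Counter()
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            counts[classify_header(name)] += 1
    return counts


# Tiny in-memory example; the sequence lines are placeholders.
example = [
    ">chr1", "ACGT",
    ">chrM", "ACGT",
    ">chr22_KI270738v1_random", "ACGT",
    ">chrUn_KI270302v1", "ACGT",
    ">chr1_KI270762v1_alt", "ACGT",
]
print(count_categories(example))
```

For a real genome you could pass an open file handle instead of the list, since the function only iterates over lines. The sum of all categories should match your count of '>' characters.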


A FASTA file is always "haploid" in the sense that, regardless of ploidy, you always have a single sequence per chromosome. For more on the components of the reference genome beyond the "standard" chromosomes, see: https://gatk.broadinstitute.org/hc/en-us/articles/360041155232-Reference-Genome-Components


There is no such thing as the human genome. There are various human reference genomes (different builds, different versions) and you need to pick the one that fits your application. If there are many ALT contigs included, or it is not a reference genome but an individual assembly, 705 scaffolds might be an accurate number.

You will have to share more info on the genome if you want more specific replies.


The latest human reference genome, build 38 (GRCh38): https://www.ncbi.nlm.nih.gov/genome/guide/human/
