Mus musculus genome scaffold discrepancy
7 weeks ago
SJP • 0

Hi all,

I am working with the Mus musculus refseq genome GCF_000001635.27. On the NCBI webpage for the genome (https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000001635.27/) the description of the genome states there are 101 scaffolds, as does the assembly report. However, when I count '>' in the .fna file the count is 61, and when I click "view refseq sequences" there are 61 sequences listed on the webpage.

I am new to genomics and cannot for the life of me figure out where the 101 number comes from. The 61 sequences include some with the description "unplaced scaffold", so it can't simply be that these types of sequences are excluded.

genome ncbi • 239 views
7 weeks ago
Wayne ★ 2.1k

See "How are genome assemblies generated and what are assembly levels?":

The 305 contigs cited there under 'Assembly statistics' were used to make the 101 scaffolds.

Sixty-two of the 101 scaffolds went into making the assemblies of the 21 chromosomes and the mitochondrial genome listed at the bottom there.
However, while 18 scaffolds were localized to chromosomes, they were not precisely localized along those chromosomes (see 'Unlocalized count' there in the 'Chromosomes' table), and 21 of the scaffolds remain completely unplaced at this time (indicated by the 'Note: This genome assembly includes 21 unplaced scaffolds' there under 'Chromosomes').
Plus, there remain 143 gaps in that assembly according to 'Gaps between scaffolds' there under 'Assembly statistics'; within the assembled sequences, those gaps are represented by runs of N's.
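
If you want to see those runs of N's for yourself, here is a minimal sketch in Python. The function name `count_n_runs` and the `min_len` parameter are my own, hypothetical choices, not part of any NCBI tool. Note that a chromosome record can also contain gaps inside scaffolds, so the number of N runs you find will not necessarily equal the 143 reported as 'Gaps between scaffolds'.

```python
import re

def count_n_runs(sequence: str, min_len: int = 1) -> int:
    """Count runs of consecutive N/n characters that are at least min_len long.

    `sequence` is the bases of one FASTA record joined into a single string
    (no header line, no newlines).
    """
    runs = re.finditer(r"[Nn]+", sequence)
    return sum(1 for run in runs if len(run.group()) >= min_len)

# Toy sequence with two N runs: one of length 4 and one of length 2.
toy = "ACGTNNNNACGTnnACGT"
print(count_n_runs(toy))             # every run of N's
print(count_n_runs(toy, min_len=3))  # only runs of 3 or more N's
```

Raising `min_len` is a rough way to skip very short runs of ambiguous bases and keep only the longer stretches that are more likely to be real assembly gaps.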

Another way to look at the accounting:

21 assembled chromosomes
1 assembled mitochondrial genome
18 unlocalized, chromosomal scaffolds
21 unplaced scaffolds
----------------------
TOTAL: 61 RefSeq entries (from the original 101 scaffolds)
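
You can check that breakdown against the '>' lines in your .fna file. Below is a rough sketch in Python. It assumes the header descriptions contain the words "unlocalized", "unplaced", or "mitochondrion" for those categories and treats everything else as an assembled chromosome. That wording is an assumption on my part, as are the function names and the sample headers, so look at a few of your real headers before trusting the counts.

```python
from collections import Counter

def classify_header(header: str) -> str:
    """Assign a FASTA header line to one of four categories by keyword.

    The keywords are assumptions about how the descriptions are worded.
    """
    text = header.lower()
    if "unlocalized" in text:
        return "unlocalized scaffold"
    if "unplaced" in text:
        return "unplaced scaffold"
    if "mitochondri" in text:
        return "mitochondrial genome"
    return "assembled chromosome"

def tally_headers(lines) -> Counter:
    """Count FASTA records per category; non-header lines are ignored."""
    counts = Counter()
    for line in lines:
        if line.startswith(">"):
            counts[classify_header(line)] += 1
    return counts

# Made-up sample lines for illustration; they are not copied from the real file.
sample = [
    ">seq1 Mus musculus chromosome 1",
    "ACGTACGT",
    ">seq2 Mus musculus chromosome 1 unlocalized genomic scaffold",
    "ACGT",
    ">seq3 Mus musculus unplaced genomic scaffold",
    "ACGT",
    ">seq4 Mus musculus mitochondrion, complete genome",
    "ACGT",
]
print(tally_headers(sample))

# The accounting from the answer: the four categories should sum to 61.
expected = {
    "assembled chromosome": 21,
    "mitochondrial genome": 1,
    "unlocalized scaffold": 18,
    "unplaced scaffold": 21,
}
print(sum(expected.values()))
```

To run it on the real genome, pass an open file handle for your .fna file to `tally_headers` in place of `sample` and compare the result with `expected`.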
