The top-level FASTA file includes chromosomes, regions not assembled into chromosomes, and N-padded haplotype/patch regions. See more here: ftp://ftp.ensembl.org/pub/release-92/fasta/mus_musculus/dna/README. If you only want the chromosome-level sequences of the reference genome assembly, use the primary_assembly.fa file.
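If you want to check for yourself which sequences a given file contains, a minimal sketch along these lines lists the sequence names from the header lines. The function name and the file name in the usage note are made up for illustration, and it assumes the file is gzip-compressed with the sequence name as the first word after ">".

```python
import gzip


def fasta_sequence_names(path):
    """Return the first word of every '>' header line in a gzip-compressed FASTA.

    Hypothetical helper for illustration; assumes the sequence name is the
    first whitespace-delimited token after '>'.
    """
    names = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                names.append(fields[0] if fields else "")
    return names
```

Calling `fasta_sequence_names("toplevel.fa.gz")` (placeholder file name) on each download and comparing the two lists should show which extra sequences the top-level file carries.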
The files in the dna_index directory are genomic sequence files which are bgzipped and tabix indexed (for more details on what this means see: http://www.htslib.org/doc/tabix.html). The Variant Effect Predictor (VEP) installer downloads them to allow quicker VEP'ing. The FASTA file without the .fai or .gzi suffix is listed with a different size, but it contains the same data as the FASTA file in the fasta/mus_musculus/dna/ folder, so you can download either one.
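If you'd like to confirm this yourself after downloading both files, one rough way is to hash the decompressed contents and compare. This is only a sketch: the function names are made up, and it relies on bgzip output being readable as an ordinary (multi-member) gzip stream, which Python's gzip module handles. It compares bytes, so any difference in line wrapping would also show up as a mismatch.

```python
import gzip
import hashlib


def decompressed_sha256(path, chunk_size=1 << 20):
    """SHA-256 of the decompressed contents of a gzip or bgzip file."""
    digest = hashlib.sha256()
    with gzip.open(path, "rb") as handle:
        # Read in chunks so a whole genome never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_sequence_data(path_a, path_b):
    """True if both compressed files decompress to identical bytes."""
    return decompressed_sha256(path_a) == decompressed_sha256(path_b)
```

The compressed sizes can differ because the two files are compressed differently, so comparing the decompressed data is the meaningful check.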
We'll update the README files, or 'hide' the dna_index folder to avoid confusion between these files in the two folders. Thanks for bringing it to our attention!