I wish to align my mouse WES fastq files against the GRCm38 reference genome. I found the genome on the UCSC portal (https://hgdownload.soe.ucsc.edu/goldenPath/mm10/snp142Mask/). However, as far as I understand, the fasta files there are only available for individual chromosomes. How/where can I get the WES fasta file for Mus musculus? Also, once aligned, I wish to annotate the variants using snpEff, but the GRCm38 genome in their database is a different version (GRCm38.75). Will this cause an 'ERROR_CHROMOSOME_NOT_FOUND' error?
This is not a forum post, but a question.
agreed
You can find data by using sra-explorer (search for "WES mouse"), assuming that by WES you are referring to whole exome sequencing. You are not going to get a fasta file. You will need to get the original fastq reads and do the analysis yourself. NCBI Datasets is the new experimental location to get large datasets like genomes.
You are looking at the SNP-masked files (snp142Mask). You are probably looking for the more general version of the genome, which is here: https://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/
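On the snpEff part of the question: one likely source of 'ERROR_CHROMOSOME_NOT_FOUND' is chromosome naming rather than the version suffix. UCSC mm10 names chromosomes "chr1", "chrM", and so on, while the Ensembl-based GRCm38.75 database uses "1", "MT", and so on. Below is a minimal Python sketch that renames the chromosomes in VCF lines before annotation. The function names are made up for illustration, and it assumes only the primary chromosomes (1-19, X, Y, M) need renaming; unplaced and random contigs are left untouched and would need their own mapping.

```python
import re

# Matches UCSC-style primary chromosome names such as chr1, chr19, chrX, chrY.
_PRIMARY = re.compile(r"^chr(\d+|X|Y)$")
_CONTIG_PREFIX = "##contig=<ID="


def ucsc_to_ensembl(name: str) -> str:
    """Map a UCSC-style name to Ensembl style; unknown names pass through."""
    if name == "chrM":
        return "MT"  # mitochondrial genome is named MT in Ensembl
    match = _PRIMARY.match(name)
    return match.group(1) if match else name


def rename_vcf_line(line: str) -> str:
    """Rename the chromosome in one VCF line (contig header or record)."""
    if line.startswith(_CONTIG_PREFIX):
        rest = line[len(_CONTIG_PREFIX):]
        # The contig ID runs up to the first ',' or '>'.
        match = re.match(r"([^,>]+)(.*)", rest, re.S)
        if match is None:
            return line
        return _CONTIG_PREFIX + ucsc_to_ensembl(match.group(1)) + match.group(2)
    if line.startswith("#"):
        return line  # other header lines are unchanged
    chrom, sep, rest = line.partition("\t")
    return ucsc_to_ensembl(chrom) + sep + rest


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python rename_chroms.py < in.vcf > out.vcf
    for vcf_line in sys.stdin:
        sys.stdout.write(rename_vcf_line(vcf_line))
```

An alternative that avoids renaming altogether is to align against the Ensembl GRCm38 fasta in the first place, so that the names already match the snpEff database.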