Hi. I'm struggling to import data into SeqMonk. I think the problem may be that the assembly used in SeqMonk doesn't match the reference genome that I indexed and used for mapping.
I built my Subread index from this FASTA of GRCh38, which says it's the latest genomic fna, but I'm not sure which specific version it is.
The assembly in SeqMonk is GRCh38.97.
I am assuming the issue is with the genome because trying to load data into SeqMonk generates a long list of warnings like this:
[108788626 times] Couldn't extract a valid name from 'NC_000001.11'
[81316054 times] Couldn't extract a valid name from 'NC_000002.12'
[67257652 times] Couldn't extract a valid name from 'NC_000019.10'
[66153570 times] Couldn't extract a valid name from 'NC_000012.12'
Help!
The assembly in SeqMonk appears to be from Ensembl (the 97 is likely the Ensembl release version), whereas your reference genome seems to be from NCBI. Is that correct?
In general, patch levels do not change chromosome coordinates for a major genome build. Unless you care about the contents of the patches, you would be fine using a general GRCh38 reference.
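If the warnings do come from the naming difference (an assumption on my part; the messages only say that the names could not be parsed), one option is to rename the RefSeq accessions to Ensembl-style chromosome names before importing. Here is a minimal Python sketch of that idea. The helper names `refseq_to_ensembl` and `rename_sam_line` are made up for illustration, and it only covers the primary chromosomes and the mitochondrion:

```python
import re

# RefSeq accessions for the human primary chromosomes look like NC_000001.11:
# the two digits after "NC_0000" are the chromosome number (23 = X, 24 = Y).
_REFSEQ_CHROM = re.compile(r"^NC_0000(\d{2})\.\d+$")
_SEX_CHROMS = {23: "X", 24: "Y"}
_MITO_ACCESSION = "NC_012920.1"  # human mitochondrial genome


def refseq_to_ensembl(name):
    """Return the Ensembl-style name for a RefSeq accession, or None if unknown."""
    if name == _MITO_ACCESSION:
        return "MT"
    match = _REFSEQ_CHROM.match(name)
    if match is None:
        return None
    number = int(match.group(1))
    if 1 <= number <= 22:
        return str(number)
    return _SEX_CHROMS.get(number)


def rename_sam_line(line):
    """Rename reference names in one SAM line; unrecognised names are left as-is."""
    fields = line.rstrip("\n").split("\t")
    if line.startswith("@SQ"):
        # Header line: the reference name lives in the SN: tag.
        for i, field in enumerate(fields):
            if field.startswith("SN:"):
                new_name = refseq_to_ensembl(field[3:])
                if new_name is not None:
                    fields[i] = "SN:" + new_name
    elif not line.startswith("@") and len(fields) >= 11:
        # Alignment line: column 3 is RNAME and column 7 is RNEXT.
        for col in (2, 6):
            new_name = refseq_to_ensembl(fields[col])
            if new_name is not None:
                fields[col] = new_name
    return "\t".join(fields)
```

You could apply `rename_sam_line` to every line of a SAM stream, header included. Scaffolds and patches (NT_/NW_ accessions) are left unchanged, as are reference names stored inside optional tags such as SA, so treat this as a starting point rather than a complete converter.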
Yes, you're right. The reference genome is from NCBI.
I'm not quite sure why SeqMonk is generating the warning above, though. Mapping my samples to the NCBI reference seems to be okay, I think, with about 80% of reads mapping successfully on average.
If you only want to view the alignments, then use IGV. It should work with both versions of the genome (NCBI/Ensembl).