Dear all, I have recently been learning how to generate a mask file for non-human species. I followed the instructions on Heng Li's SNPable Regions website, using human_b36_male.fa as an example. However, the final result file 'mask_35_50.fa' appears to be incomplete. For example, chromosome 1 spans 4,120,830 lines in the reference file, while only 117,740 lines remain in the final mask file. I changed the bwa version from 0.7.17 to 0.5.8 and re-indexed the reference file, but the same issue occurred. I have no idea what is going wrong during generation. Could anybody please help me with this? Thank you for your help.

Can you calculate the length of the FASTA entries before and after masking, for example for chromosome 1? Based on the documentation, applying the mask only changes the low-mappability regions to lowercase, and the sequence length should stay the same. Can you confirm this in your data?
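
One way to run this check is sketched below in Python. It is only a minimal illustration, not part of the SNPable tools: the function name fasta_lengths is made up for this example, and it assumes both files are plain (uncompressed) FASTA-style text where each entry starts with a '>' header line.

```python
import sys


def fasta_lengths(lines):
    """Return {entry name: total sequence length} for FASTA-style text.

    `lines` is any iterable of text lines, e.g. an open file handle.
    The entry name is the first whitespace-separated token after '>'.
    """
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            lengths[name] = 0
        elif name is not None:
            # Count sequence characters only; newlines were stripped above.
            lengths[name] += len(line)
    return lengths


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print one tab-separated row per entry: file, entry name, length.
    for path in sys.argv[1:]:
        with open(path) as handle:
            for entry, length in fasta_lengths(handle).items():
                print(path, entry, length, sep="\t")
```

Saved under a name of your choice (fasta_lengths.py is just a placeholder), it could be run as `python fasta_lengths.py human_b36_male.fa mask_35_50.fa`, and the rows for chromosome 1 from the two files can then be compared directly.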

Hi Vitis, I have checked the length of the FASTA entries, and the sequence is shorter after masking. For chromosome 1, there are 247,249,800 characters in the reference, while only 4,120,900 characters remain after masking. I assume something goes wrong when mapping the short reads against the reference FASTA file, but I don't know what it is, since no error is reported and the masking process completes successfully. Thank you for your advice.

Can you make a reproducible case and report it to SNPable Regions, or raise it as a possible issue? Without the exact same data and a reproducible example, it will be hard to figure out what went wrong.

The data I used is the tutorial example file provided by the SNPable Regions website. I have tried to contact Heng Li, who is the author, but I got no answer.

