Question: Generating a mappability mask file for non-human species
zhengchenfei wrote (4 weeks ago):

Dear all,

I am learning how to generate a mappability mask file for a non-human species. I followed the instructions on Heng Li's SNPable Regions website, using human_b36_male.fa as an example. However, the final result file, 'mask_35_50.fa', appears to be incomplete. For example, chromosome 1 spans 4,120,830 lines in the reference file, but only 117,740 lines in the final mask file. I changed the bwa version from 0.7.17 to 0.5.8 and re-indexed the reference file, but the same problem occurred. I have no idea what is going wrong during mask generation. Could anybody please help me with this? Thank you for your help.

assembly

Can you calculate the length of the FASTA entries before and after masking, for example for chromosome 1? According to the documentation, applying the mask only converts low-mappability regions to lowercase, so the sequence length should stay the same. Can you confirm this in your data?

written 4 weeks ago by Vitis
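
One way to run the check suggested above is to count the bases in each FASTA record directly, instead of comparing line counts, since two files can wrap their sequence lines at different widths. The following is a minimal sketch in Python, not part of the SNPable tools. The function names `fasta_stats` and `length_changes` are made up for this illustration, and the sketch assumes both files are plain, uncompressed FASTA text.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def fasta_stats(lines: Iterable[str]) -> Dict[str, Tuple[int, int]]:
    """Return {record name: (number of bases, number of sequence lines)}.

    The record name is the first whitespace-delimited word after '>'.
    """
    stats: Dict[str, Tuple[int, int]] = {}
    name: Optional[str] = None
    bases = n_lines = 0
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if name is not None:
                stats[name] = (bases, n_lines)
            fields = line[1:].split()
            name = fields[0] if fields else ""
            bases = n_lines = 0
        elif name is not None and line:
            bases += len(line)
            n_lines += 1
    if name is not None:
        stats[name] = (bases, n_lines)
    return stats


def length_changes(
    ref: Dict[str, Tuple[int, int]], mask: Dict[str, Tuple[int, int]]
) -> List[Tuple[str, int, Optional[int]]]:
    """List (name, reference length, mask length) for records whose
    lengths differ; the mask length is None if the record is missing."""
    changed = []
    for name, (ref_len, _) in ref.items():
        mask_len = mask[name][0] if name in mask else None
        if mask_len != ref_len:
            changed.append((name, ref_len, mask_len))
    return changed
```

To apply it to the files from the question, open each file and pass the file object in, for example `fasta_stats(open("human_b36_male.fa"))` and `fasta_stats(open("mask_35_50.fa"))`, then hand both results to `length_changes`. An empty list means every record has the same length in both files, even if the line counts differ.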

Hi Vitis, I have checked the lengths of the FASTA entries, and the sequence is shorter after masking. For chromosome 1, the reference contains 247,249,800 characters, while only 4,120,900 characters remain after masking. I suspect something goes wrong when the short reads are mapped against the reference FASTA file, but I cannot tell what, since no error is reported and the masking step appears to complete successfully. Thank you for your advice.

written 4 weeks ago by zhengchenfei

Can you make a reproducible case and report it to SNPable Regions, or raise it as a possible issue? Without the exact same data and a reproducible example, it will be hard to figure out what went wrong.

written 4 weeks ago by Vitis
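
A reproducible case is easier to share when the input is small. As a rough sketch, the Python function below cuts the first part of one record out of a FASTA file so the whole pipeline can be rerun on it quickly. The name `take_prefix`, the record name, and the prefix length in the usage note are all assumptions for illustration, and there is no guarantee that the problem still appears on a reduced input.

```python
from typing import Iterable, List


def take_prefix(
    lines: Iterable[str], name: str, max_bases: int, width: int = 60
) -> List[str]:
    """Return FASTA lines holding the first `max_bases` bases of record `name`.

    The record name is the first whitespace-delimited word after '>'.
    Raises KeyError if no such record exists.
    """
    parts: List[str] = []
    kept = 0
    in_record = False
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if in_record:
                break  # reached the record after the one we wanted
            fields = line[1:].split()
            in_record = bool(fields) and fields[0] == name
        elif in_record:
            piece = line[: max_bases - kept]
            parts.append(piece)
            kept += len(piece)
            if kept >= max_bases:
                break  # enough bases collected, stop reading
    if not in_record:
        raise KeyError(name)
    seq = "".join(parts)
    # Re-wrap the sequence at a fixed line width.
    return [">" + name] + [seq[i : i + width] for i in range(0, len(seq), width)]
```

For example, `take_prefix(open("human_b36_male.fa"), "1", 1000000)` would return the first million bases of a record named `1`, assuming the reference uses that name for chromosome 1. Writing the returned lines to a new file gives a small reference on which to rerun the masking steps.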

The data I used is the tutorial example file provided on the SNPable Regions website (ftp://ftp.sanger.ac.uk/pub/1000genomes/reference/). I have tried to contact Heng Li, the author, but I have not received an answer.

written 4 weeks ago by zhengchenfei