I'm trying to apply the 1000 Genomes accessible genome masks (ftp://ftp-trace.ncbi.nlm.nih.gov/1000genomes/ftp/phase1/analysis_results/supporting/accessible_genome_masks) to the phase 3 VCFs using VCFtools with the --mask option. Strangely, my output files are always truncated to a few hundred KB, whereas each should span an entire chromosome. I am not sure why this is happening.
I rewrote the mask FASTA files into the format VCFtools requires, simply converting P (= pass) to 0 and every other character to 1. (I also tried the reverse convention, since the VCFtools documentation for this option is confusing, and I tried both --mask and --invert-mask with each convention.) I have checked that everything else in the original FASTA files (number of characters, position of line breaks, etc.) is preserved. Does anyone know what could be causing this problem?
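For concreteness, the conversion described above amounts to something like the following Python sketch. The function name `convert_mask` and the tiny demo input are illustrative only, not the script actually used; it assumes header lines start with `>` and that every other line holds one mask character per base.

```python
import io

def convert_mask(src, dst):
    """Copy a FASTA-style mask from src to dst, mapping P -> 0 and any other code -> 1."""
    for line in src:
        if line.startswith(">"):
            dst.write(line)  # header lines are copied unchanged
            continue
        body = line.rstrip("\r\n")
        ending = line[len(body):]  # keep the original line ending as-is
        dst.write("".join("0" if c == "P" else "1" for c in body) + ending)

# Demo on an in-memory mask (made-up data, not a real 1000 Genomes mask).
demo_in = io.StringIO(">1 demo\nNNPPL\nPPH\n")
demo_out = io.StringIO()
convert_mask(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

Because the mapping is one character to one character and line endings are passed through untouched, the output has the same length and line breaks as the input, which matches the checks mentioned above.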
Thanks very much!