Question: VCF file phasing by SHAPEIT
Hi everybody,

I would like to phase (phasing only, no imputation) a VCF file containing about 1100 individuals from a given human population. The data come from whole-genome sequencing, and the VCF file was produced with GATK. From what I found, SHAPEIT is the most commonly used tool. According to its manual, it requires a genetic map for phasing. However, the genetic map linked there is based on HapMap, build 37 (b37), and the link does not work: the error says “The requested URL /genetics_software/shapeit/shapeit.html/files/genetic_map_b37.tar.gz was not found on this server”. My questions are:

1) Where can I find the genetic map? I also need it for hg38. Is there a genetic map for hg38, or how can the b37 map be converted to hg38 coordinates?

2) The SHAPEIT manual also describes “read-aware phasing”, which takes BAM and VCF files as input to extract phase-informative reads (PIRs) that are then used to phase the VCF file in the next step. As I understand it, the genetic map is no longer required in that case. Since I have the sequencing data, I think I should use this method rather than the one based on the genetic map. Am I right?

3) Is it possible to phase only a region of interest, using a subset of the VCF file for a given chromosome together with the corresponding region extracted from the full BAM file, rather than the whole BAM and VCF files?
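
To make question 3 concrete, here is a minimal Python sketch of the kind of VCF subsetting I have in mind. It is only an illustration under my own assumptions: the function name `subset_vcf_lines` and the example coordinates are made up, it works on plain (uncompressed) VCF text, and it does not touch the BAM file. In practice a dedicated tool such as bcftools or samtools would probably be the better choice for extracting a region.

```python
def subset_vcf_lines(lines, chrom, start, end):
    """Yield VCF header lines plus records on `chrom` whose POS lies in
    [start, end] (1-based, inclusive). Hypothetical helper for illustration."""
    for line in lines:
        # Keep every header line (## meta lines and the #CHROM line).
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Column 1 is CHROM, column 2 is POS in the VCF format.
        if fields[0] == chrom and start <= int(fields[1]) <= end:
            yield line


if __name__ == "__main__":
    # Tiny made-up example; real input would be read from a VCF file.
    demo = [
        "##fileformat=VCFv4.2\n",
        "#CHROM\tPOS\tID\tREF\tALT\n",
        "chr1\t100\t.\tA\tG\n",
        "chr1\t250\t.\tC\tT\n",
        "chr2\t150\t.\tG\tA\n",
    ]
    for kept in subset_vcf_lines(demo, "chr1", 90, 200):
        print(kept, end="")
```

The matching reads would then have to be extracted from the BAM file for the same region, so that the VCF subset and the BAM subset cover identical coordinates.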

Any suggestion and help would be highly appreciated.

I think you mean SHAPEIT instead of Shapiet, right?

Sorry, yes, I have corrected it. Could you please help me? I don't know why I cannot post the question to the OXSTATGEN mailing list.

I would contact the authors directly. They have a duty to follow up on these things if they are still advertising their program for current use. Feel free to point them to my comment here.

Thanks, Kevin, for the good suggestion. For now, I would like to try Eagle 2, since its authors report better accuracy and speed than SHAPEIT.

