VCF file phasing by SHAPEIT
4.0 years ago
seta ★ 1.7k

Hi everybody,

I would like to phase (phasing only, no imputation) a VCF file of about 1,100 individuals from a single human population. The data come from whole-genome sequencing, and the VCF was produced with GATK. From what I have read, SHAPEIT is the most widely used tool. According to its manual, it requires a genetic map for phasing, but the link it provides for the genetic map (HapMap-based, build 37) does not work; the error says “The requested URL /genetics_software/shapeit/shapeit.html/files/genetic_map_b37.tar.gz was not found on this server”. My questions are:

1) Where can I find the genetic map? I also need it for hg38: is there an hg38 genetic map, or is there a way to convert the build 37 map to hg38 coordinates?

2) The SHAPEIT manual also describes “read-aware phasing”, which takes BAM and VCF files as input and extracts phase-informative reads (PIRs) that are then used to phase the VCF in the next step. As I understand it, the genetic map is no longer required in that case. Since I have the sequencing data, should I use this method rather than the one based on the genetic map?

3) Is it possible to phase only a subset of interest, using the relevant part of a given chromosome's VCF together with the matching portion of the BAM file, rather than the whole BAM and VCF files?

Any suggestions or help would be highly appreciated.

vcf phasing whole genome shapiet • 4.8k views

I think you mean SHAPEIT ( http://mathgen.stats.ox.ac.uk/genetics_software/shapeit/shapeit.html ) instead of Shapiet, right?

Sorry, yes, I have corrected it. Could you please help me? I don't know why I cannot post the question to the OXSTATGEN mailing list.

I would contact the authors directly. They have a duty to follow up on these things if they are still advertising their program for current usage. Feel free to point them to my comment here.

Thanks, Kevin, for the good point. For now, I would like to try Eagle 2, which its authors report to be more accurate and faster than SHAPEIT.

Hi, have you found a solution? I need this genetic map for hg38 too. Thanks.

Did you find the GRCh38 genetic map for input to SHAPEIT? Please post if you did. Thanks!

Please do not add answers unless you're answering the top level question. Use Add Comment or Add Reply instead as appropriate. I've moved your post to a comment this time, but please be more careful in the future.

If you are using the most recent version of SHAPEIT (version 4, which you should be), then the GRCh38 maps are in the maps directory in the main folder.

3.6 years ago
miaowzai ▴ 370

I found the genetic map from 1KG Phase3 files under the IMPUTE2 ftp here: http://mathgen.stats.ox.ac.uk/impute/1000GP_Phase3/
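
Once a map file is downloaded, it can help to sanity-check it before phasing. Below is a minimal Python sketch that reads a map and linearly interpolates the genetic position (cM) at an arbitrary base-pair position. It assumes a whitespace-separated file with one header line and three columns (physical position, recombination rate in cM/Mb, cumulative genetic position in cM); please check this against the files you actually download. The function names and the toy data are made up for illustration and are not part of SHAPEIT or IMPUTE2.

```python
import bisect
import io

# Toy data only (hypothetical values), laid out in the ASSUMED three-column format:
# physical position (bp), recombination rate (cM/Mb), genetic position (cM).
TOY_MAP = """position COMBINED_rate(cM/Mb) Genetic_Map(cM)
1000 1.0 0.0
2000 2.0 0.001
4000 0.5 0.005
"""


def read_genetic_map(handle):
    """Return (positions, centimorgans) parsed from an open text handle."""
    positions, centimorgans = [], []
    next(handle, None)  # skip the single header line (assumed to be present)
    for line in handle:
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        positions.append(int(fields[0]))
        centimorgans.append(float(fields[2]))
    # Interpolation below relies on strictly increasing physical positions.
    if any(b <= a for a, b in zip(positions, positions[1:])):
        raise ValueError("map positions must be strictly increasing")
    return positions, centimorgans


def interpolate_cm(positions, centimorgans, bp):
    """Linearly interpolate the cM value at bp, clamping outside the map range."""
    if bp <= positions[0]:
        return centimorgans[0]
    if bp >= positions[-1]:
        return centimorgans[-1]
    i = bisect.bisect_right(positions, bp)  # first map point strictly right of bp
    x0, x1 = positions[i - 1], positions[i]
    y0, y1 = centimorgans[i - 1], centimorgans[i]
    return y0 + (y1 - y0) * (bp - x0) / (x1 - x0)


if __name__ == "__main__":
    pos, cm = read_genetic_map(io.StringIO(TOY_MAP))
    print(interpolate_cm(pos, cm, 3000))
```

For a real file, replace `io.StringIO(TOY_MAP)` with `open(path)`, where `path` points to one chromosome's map. Note that this does not convert coordinates between genome builds; it only reads a map as given.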

Hi. The genetic maps are available from here: https://mathgen.stats.ox.ac.uk/impute/1000GP_Phase3.html

I have also saved the data on my online storage, for future reference.

Thanks for the update!