genotype imputation pipeline
7.7 years ago

Hello. I'm trying to work through a genotype imputation pipeline, so far without success. I have PLINK .bim, .bed and .fam files, and I need to select a subset of the typed SNPs in these files and impute the other SNPs. So far so good. I then pre-phase with SHAPEIT and impute with IMPUTE2, using the .hap.gz, .legend.gz and genetic_map* files I downloaded from the IMPUTE2 website (haplotype release date: October 2014). The last step would be running SNPTEST to get association results.

At some point I get this error: "ERROR: There are no type 2 SNPs after applying the command-line settings... ".

Can anyone share their own pipeline with me? I would greatly appreciate a step-by-step pipeline; all of the above software is already installed on my computer. Please let me know. Cheers, Alessandro

GWAS SNPs imputation • 4.8k views

There is this recent paper:

Molgenis-impute: imputation pipeline in a box

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4541731/

or a pipeline from github:

https://github.com/CNSGenomics/impute-pipe


Thanks, I'll look into it.


In any case, I would be very grateful if someone could share even just a small toy example with me. My starting point is the .hap, .legend and genetic_map* files retrieved from the IMPUTE2 website, plus my genotypes in binary PLINK format as input. I think this is all I need, since I have installed all the necessary software.

3.9 years ago
  1. Pre-phasing with SHAPEIT2 ( C: Phasing with SHAPEIT )
  2. Phased imputation with IMPUTE2 ( A: ERROR: You must specify a valid interval for imputation using the -int argument, )
  3. Conversion to VCF (see #2)
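
As a rough illustration of steps 1 and 2, the sketch below only builds the SHAPEIT2 and IMPUTE2 command lines for a single chromosome, splitting it into 5 Mb windows for `-int`. All file names, the chromosome bounds and the `-Ne 20000` value are placeholders or assumptions rather than values from this thread, so please check every flag against the manual for your installed versions before running anything.

```python
from typing import List, Tuple


def chunk_intervals(start: int, end: int, size: int = 5_000_000) -> List[Tuple[int, int]]:
    """Split [start, end] into consecutive windows of at most `size` bp."""
    intervals = []
    lo = start
    while lo <= end:
        hi = min(lo + size - 1, end)
        intervals.append((lo, hi))
        lo = hi + 1
    return intervals


def shapeit_cmd(prefix: str, genetic_map: str, out_prefix: str, threads: int = 4) -> List[str]:
    """Pre-phasing command for one chromosome of PLINK binary data."""
    return [
        "shapeit",
        "--input-bed", f"{prefix}.bed", f"{prefix}.bim", f"{prefix}.fam",
        "--input-map", genetic_map,
        "--output-max", f"{out_prefix}.haps", f"{out_prefix}.sample",
        "--thread", str(threads),
    ]


def impute2_cmd(haps: str, genetic_map: str, ref_hap: str, ref_legend: str,
                interval: Tuple[int, int], out_prefix: str, ne: int = 20000) -> List[str]:
    """Imputation command for one window, using pre-phased study haplotypes."""
    lo, hi = interval
    return [
        "impute2", "-use_prephased_g",
        "-known_haps_g", haps,
        "-m", genetic_map,
        "-h", ref_hap,
        "-l", ref_legend,
        "-int", str(lo), str(hi),
        "-Ne", str(ne),  # assumed value; adjust to your panel and population
        "-o", f"{out_prefix}.{lo}_{hi}.impute2",
    ]


if __name__ == "__main__":
    # Hypothetical file names for chromosome 22; replace with your own.
    print(" ".join(shapeit_cmd("study.chr22", "genetic_map_chr22.txt", "study.chr22.phased")))
    # Hypothetical chromosome bounds; use the range covered by your data.
    for window in chunk_intervals(16_000_001, 51_000_000):
        print(" ".join(impute2_cmd("study.chr22.phased.haps", "genetic_map_chr22.txt",
                                   "ref.chr22.hap.gz", "ref.chr22.legend.gz",
                                   window, "study.chr22")))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the paths are real. Building the commands as lists first makes it easy to print and inspect them, which helps when an interval is rejected.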

Kevin


Hi Kevin,

I'm reading about imputation, but one thing is not clear to me. Suppose I want to phase and impute genotypes from a population that is not represented in the 1000 Genomes reference panel. Could the VCF files generated from whole-genome sequencing of that specific population be used as the reference (instead of the 1000 Genomes reference) for phasing and imputation with these tools?

Many thanks in advance for all your help.


Yes, technically speaking, you can use any reference dataset.
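
Whichever panel is used, the study SNPs need to line up with it: as far as I understand IMPUTE2's terminology, "type 2" SNPs are those present in both the study data and the reference panel, so the "no type 2 SNPs" error can appear when the chosen interval contains no shared positions (for example because of a genome build mismatch or the wrong chromosome). The sketch below is a simplified check that counts shared positions in an interval. It assumes a PLINK .bim file, a legend file whose header has a column named `position`, and matching by position only (alleles and strand are ignored); the function names are made up for illustration.

```python
import gzip
from typing import Set


def bim_positions(bim_path: str, chrom: str) -> Set[int]:
    """Collect base-pair positions (4th column) of one chromosome from a .bim file."""
    positions = set()
    with open(bim_path) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) >= 4 and fields[0] == chrom:
                positions.add(int(fields[3]))
    return positions


def legend_positions(legend_path: str) -> Set[int]:
    """Collect positions from a (possibly gzipped) legend file with a header row."""
    opener = gzip.open if legend_path.endswith(".gz") else open
    positions = set()
    with opener(legend_path, "rt") as fh:
        header = fh.readline().split()
        pos_col = header.index("position")  # assumes this column name exists
        for line in fh:
            fields = line.split()
            if fields:
                positions.add(int(fields[pos_col]))
    return positions


def count_type2(study: Set[int], panel: Set[int], lo: int, hi: int) -> int:
    """Count positions shared by study and panel inside [lo, hi]."""
    return sum(1 for pos in study & panel if lo <= pos <= hi)
```

If `count_type2` returns 0 for the interval passed to `-int`, it is worth checking that the study data and the panel are on the same genome build and that the right chromosome files are being paired.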
