Question: Rare variants imputation
reds.nik wrote (11 months ago):

Hi,

I am currently working on rare-variant association studies using whole-genome sequencing (WGS) data. I want to replicate my results in an array-genotyped cohort in which rare variants were previously imputed with the Michigan Imputation Server using the Haplotype Reference Consortium panel. However, I found that my top hits were poorly imputed, so I would like to re-impute the rare variants against a custom reference panel built from my own WGS data. I see that IMPUTE2 and SHAPEIT are widely recommended for this purpose, but I can't find a clear explanation of how to generate a reference panel. I have per-individual VCF files and PLINK files, as well as a merged PLINK ped file that includes all the samples.

Could anybody kindly suggest a tutorial or resource where I can learn how to do this? Also, is there a better strategy for imputing the rare variants missed by the Michigan server than generating my own reference panel?

Thanks in advance for any help.

Kevin Blighe wrote (11 months ago):

I have yet to see a good tutorial for this either, and the documentation for these programs is never great.

As you already have your data in PLINK format, you can export it straight to GEN format for IMPUTE2 with the --recode oxford command-line parameter. You should then be able to use the result straight away as a reference panel in IMPUTE2.
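A minimal sketch of that export step, assuming PLINK 1.9 is installed as `plink` and that the merged dataset is a ped/map fileset. The prefixes `merged_wgs` and `merged_wgs_chr22` and the helper name `plink_to_gen_cmd` are hypothetical. The command is restricted to one chromosome with --chr because IMPUTE2 is run one chromosome at a time. The helper only assembles and prints the command, so it can be checked before anything is run.

```python
import shlex


def plink_to_gen_cmd(ped_prefix, out_prefix, chrom):
    """Build a PLINK 1.9 command that exports one chromosome of a
    ped/map fileset to Oxford GEN format (.gen + .sample)."""
    return [
        "plink",
        "--file", ped_prefix,      # text fileset: <prefix>.ped + <prefix>.map
        "--chr", str(chrom),       # keep a single chromosome
        "--recode", "oxford",      # write <out_prefix>.gen and <out_prefix>.sample
        "--out", out_prefix,
    ]


if __name__ == "__main__":
    # Hypothetical file prefixes, for illustration only.
    cmd = plink_to_gen_cmd("merged_wgs", "merged_wgs_chr22", 22)
    # Print the command; paste it into a shell (or pass it to
    # subprocess.run) once the prefixes match your own files.
    print(shlex.join(cmd))
```

If the data are in binary PLINK format (.bed/.bim/.fam), `--bfile` would replace `--file`.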

Alternatively, starting from the VCF stage, I would merge your samples into a single VCF and then use this script to convert VCF format to GEN.
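The following is not the script mentioned above, just a rough illustration of what a VCF-to-GEN conversion involves. It assumes hard-called, biallelic, single-chromosome records and the five-column GEN layout (SNP ID, rsID, position, allele A, allele B) followed by three genotype probabilities per sample. The function name `vcf_line_to_gen` is hypothetical, and missing genotypes are written as `0 0 0`.

```python
# Hard-call genotype -> (P(AA), P(AB), P(BB)) with A = REF and B = ALT.
GT_TO_PROBS = {
    "0/0": "1 0 0",
    "0/1": "0 1 0",
    "1/0": "0 1 0",
    "1/1": "0 0 1",
}


def vcf_line_to_gen(line):
    """Convert one biallelic VCF data line to one GEN line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, vid, ref, alt = fields[:5]
    if "," in alt:
        raise ValueError("multi-allelic site: split or drop it first")
    # Locate the GT entry within the FORMAT column (e.g. GT:DP:GQ).
    gt_index = fields[8].split(":").index("GT")
    # Fall back to chrom:pos when the VCF has no variant ID.
    snp_id = vid if vid != "." else f"{chrom}:{pos}"
    probs = []
    for sample in fields[9:]:
        # Treat phased (|) and unphased (/) calls the same way.
        gt = sample.split(":")[gt_index].replace("|", "/")
        # Anything unrecognised (e.g. ./.) is written as missing.
        probs.append(GT_TO_PROBS.get(gt, "0 0 0"))
    return " ".join([snp_id, snp_id, pos, ref, alt] + probs)


def vcf_to_gen(vcf_path, gen_path):
    """Convert every data line of an uncompressed VCF, skipping headers."""
    with open(vcf_path) as vcf, open(gen_path, "w") as gen:
        for line in vcf:
            if not line.startswith("#"):
                gen.write(vcf_line_to_gen(line) + "\n")
```

A matching .sample file listing the individuals in VCF column order is also needed, which this sketch does not write.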

I would also consider creating a merged reference panel that consists of your data plus 1000 Genomes. There have been posts on this in the past but it's not a frequent topic. Here, I am trying to link all of them:

Kevin


Thanks a lot, Kevin, for your suggestions. I managed to create my own reference panel with the --recode oxford option in PLINK. However, as you said, given my relatively small sample size, merging my reference panel with 1000 Genomes data could be a better choice.

written 10 months ago by reds.nik