Selection of SNPs after imputation
3.7 years ago

Hello everyone

This is my first time working on imputation of GWAS data. I have chromosome-specific VCF files. One of the chromosome files contains 195,276 SNPs for 293 individuals. These are the steps I followed:

1) I uploaded the VCF to the Michigan Imputation Server and selected a reference panel. All steps (Input Validation, Quality Control, and Pre-phasing and Imputation) completed without any error. The report stated:

Excluded sites in total: 1,532
Remaining sites in total: 339,099

As output I received chr.dose.vcf.gz

2) Next, I used PLINK to convert the output to the "bed" and "bim" file formats.

./plink --vcf chr.dose.vcf.gz --make-bed --double-id --biallelic-only --out chr_biallelic

The PLINK log file reported "4057885 variants and 293 people pass filters and QC".

That is a big change in the number of SNPs, from 339,099 to 4,057,885.

3) Is it OK to extract only the 339,099 SNPs from chr.dose.vcf.gz, so that I keep just the desired SNP sites?

I would appreciate any suggestions.

Thanks in advance A

gwas imputation • 974 views

Maybe I'm not following, but it looks like you uploaded 339,099 directly typed sites to the imputation server. The server then imputed out to roughly 4 million sites.

Why would you want to extract your 339,099 SNPs after imputation? What was the point of imputing, then?

I'm scratching my head a bit here, could you better describe what you are trying to achieve?
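
One way to check where the extra sites come from is to count the records in chr.dose.vcf.gz by origin. The sketch below assumes the INFO column carries the flags `TYPED`, `IMPUTED`, and `TYPED_ONLY`, as Minimac-style output usually does; if your file uses other labels, every record will land in the `unlabelled` bucket. The function names are made up for illustration.

```python
import gzip
from collections import Counter


def open_vcf(path):
    """Open a plain or gzip-compressed VCF as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_by_origin(path):
    """Count VCF records by the origin flag found in the INFO column.

    Assumption: INFO contains one of TYPED_ONLY, TYPED, or IMPUTED.
    """
    counts = Counter()
    with open_vcf(path) as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            fields = line.rstrip("\n").split("\t", 8)
            # INFO is the 8th column; keep only the key of each KEY=VALUE item
            info_keys = {item.split("=", 1)[0] for item in fields[7].split(";")}
            if "TYPED_ONLY" in info_keys:
                counts["typed_only"] += 1
            elif "TYPED" in info_keys:
                counts["typed"] += 1
            elif "IMPUTED" in info_keys:
                counts["imputed"] += 1
            else:
                counts["unlabelled"] += 1
    return counts


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(dict(count_by_origin(sys.argv[1])))
```

If the typed counts roughly match the number of sites you uploaded for that chromosome and the imputed count makes up the rest, the jump to about 4 million variants is just the imputation doing its job.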

3.7 years ago
Yean ▴ 140

It depends on your goal, in my opinion.

If you are performing a GWAS, I think you should keep all variants, because novel variants could be associated with the phenotype you are studying. But if you are calculating a genetic risk score, it is fine to keep only the desired SNPs.
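
For the risk-score case, here is a rough sketch of how you might check which of your desired SNPs are actually present in the imputed file and write them out as an ID list. The helper names and the output file name are hypothetical. It matches on the VCF ID column exactly, so the list has to use the same naming as the file; imputed output often uses IDs of the form chrom:pos:ref:alt rather than rs numbers, so check a few lines first.

```python
import gzip


def split_requested_ids(vcf_path, wanted_ids):
    """Return (found, missing) for the requested variant IDs.

    found   : requested IDs present in the VCF, in file order
    missing : requested IDs absent from the VCF, sorted
    """
    wanted = set(wanted_ids)
    found = []
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            variant_id = line.split("\t", 3)[2]  # ID is the 3rd column
            if variant_id in wanted:
                found.append(variant_id)
    missing = sorted(wanted - set(found))
    return found, missing


def write_id_list(ids, out_path):
    """Write one variant ID per line."""
    with open(out_path, "w") as out:
        for variant_id in ids:
            out.write(f"{variant_id}\n")
```

The resulting file can then be given to PLINK's `--extract` option, for example `./plink --bfile chr_biallelic --extract keep_ids.txt --make-bed --out chr_subset`, where `keep_ids.txt` and `chr_subset` are placeholder names.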
