How to filter Michigan genotype imputation results
3.6 years ago
robjohn70000 ▴ 130

Hi,

I have just downloaded imputation result files from the Michigan Imputation Server. The files include ".info.gz" and ".dosage.gz". I would like to filter the genotype results by R^2 > 0.8 to keep only well-imputed genotypes. I'm new to this kind of analysis and am not sure how to proceed from here. Can someone please advise me on how to do this?

Here is an example of the .info file:

SNP     REF(0)  ALT(1)  ALT_Frq MAF     AvgCall Rsq     Genotyped       LooRsq  EmpR    EmpRsq  Dose0   Dose1
1:62246 C       T       0.30621 0.30621 0.69406 0.14963 Imputed -       -       -       -       -
1:62209 T       G       0.25723 0.25723 0.75622 0.11694 Imputed -       -       -       -       -

Here is an example of the .dosage file:

##fileformat=VCFv4.1
##filedate=2018.4.11
##source=Minimac3
##contig=<ID=10>
##FILTER=<ID=GENOTYPED,Description="Marker was genotyped AND imputed">
##FILTER=<ID=GENOTYPED_ONLY,Description="Marker was genotyped but NOT imputed">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=DS,Number=1,Type=Float,Description="Estimated Alternate Allele Dosage : [P(0/1)+2*P(1/1)]">
##FORMAT=<ID=GP,Number=3,Type=Float,Description="Estimated Posterior Probabilities for  Genotypes 0/0, 0/1 and 1/1 ">
##INFO=<ID=AF,Number=1,Type=Float,Description="Estimated Alternate Allele Frequency">
##INFO=<ID=MAF,Number=1,Type=Float,Description="Estimated Minor Allele Frequency">
##INFO=<ID=R2,Number=1,Type=Float,Description="Estimated Imputation Accuracy">
##INFO=<ID=ER2,Number=1,Type=Float,Description="Empirical (Leave-One-Out) R-square (available only for genotyped variants)">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT Sample1 Sample2

However, the "INFO" column, which contains R2 in the dosage file, has this format:

AF=0.00036;MAF=0.00036;R2=0.00035
AF=0.08734;MAF=0.08734;R2=0.18100
Imputation GWAS QC RSQ • 4.1k views

What does it look like? Please provide an example.

3.5 years ago
GK1610 ▴ 100

Try this:

file=dosage.vcf.gz

# Keep only variants whose INFO/R2 is above 0.8, then index the filtered file
bcftools view -i 'INFO/R2>0.8' -Oz -o "$file.filtered.vcf.gz" "$file"
tabix -p vcf "$file.filtered.vcf.gz"
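
If bcftools is not available, the same threshold could also be applied to the .info file, since it carries an Rsq column per variant. The following is only a minimal sketch: it assumes the .info.gz file is whitespace-delimited with a header row containing a column named "Rsq" (as in the example in the question), and the function name and the file names in the comment are hypothetical.

```shell
# Print the IDs (first column) of variants whose Rsq exceeds a threshold.
# Assumption: gzip-compressed, whitespace-delimited file whose header row
# has a column literally named "Rsq".
filter_info_by_rsq() {
    info_file=$1
    threshold=${2:-0.8}   # defaults to 0.8 when no second argument is given
    gzip -cd "$info_file" | awk -v t="$threshold" '
        # Header row: locate the Rsq column by name rather than by position
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "Rsq") col = i; next }
        # Data rows: print the variant ID when Rsq is strictly above the threshold
        col && ($col + 0) > (t + 0) { print $1 }
    '
}

# Example call (hypothetical file names):
# filter_info_by_rsq chr10.info.gz 0.8 > chr10.rsq_gt_0.8.snps
```

This only yields a list of variant IDs; the genotype file itself still has to be subset with a tool that understands VCF, and the IDs in the .info file may not match the ID column of the VCF exactly, so check that before relying on the list.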


Hi,

Can you please tell me what the .dose.vcf.gz.filtered.vcf.gz.tbi output file represents? Am I supposed to proceed with my subsequent analysis using .dose.vcf.gz.filtered.vcf.gz? I am planning to run a GWAS with plink on these files.

Thanks
