GWAS imputation and quality control
2.7 years ago
Hi there.

I'm new to GWAS and have read some papers and tutorials to figure out the steps of the analysis. Fortunately, Biostars has some good information on how to start, but I would like to restate what I have learned to make sure I understand the steps.

As far as I know, the raw output files from array platforms are intensity .idat files. They can be converted to .gtc files for faster genotype calling (GenomeStudio is usually used for genotype calling). Imputation is the second step, and then quality control is done on the data. The files are then converted to standard file formats (.bed, .bim, .fam and so on) for PLINK or other appropriate software to perform association studies.

1- Am I right? Is that all?

2- When the array center has genotyped about 800,000 SNPs and says they are performing the imputation, what does that mean? Does it mean they will provide us with imputed data, so that we don't need to perform this step ourselves?

3- Do array centers perform the quality control too? Are the data delivered to us in a standard format, ready for an association study in PLINK?

Sorry if my questions seem so simple or naive.

Thanks a lot.

Tags: GWAS, SNP, SNP array, imputation, QC
I think you need to contact the "array centers" with these questions.
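Once the data arrive, one quick way to see whether you were given a PLINK 1 binary fileset is to check that the three files agree with each other. The snippet below is only a rough sketch, not an official PLINK tool: the function name `check_plink_fileset` and the file prefix are made up for illustration. It relies on the documented .bed layout (two magic bytes, one mode byte, then one block per variant with four samples packed per byte). It says nothing about whether imputation or quality control has been done, which is still something to ask the array center.

```python
import math
from pathlib import Path

PLINK_BED_MAGIC = bytes([0x6C, 0x1B])  # first two bytes of a PLINK 1 .bed file
VARIANT_MAJOR = 0x01                   # third byte: variant-major layout


def count_lines(path):
    """Count non-empty lines (one per sample in .fam, one per variant in .bim)."""
    with open(path, "rb") as handle:
        return sum(1 for line in handle if line.strip())


def check_plink_fileset(prefix):
    """Hypothetical helper: return (n_samples, n_variants) if
    prefix.bed/.bim/.fam are mutually consistent, else raise ValueError."""
    prefix = str(prefix)
    n_samples = count_lines(prefix + ".fam")
    n_variants = count_lines(prefix + ".bim")

    bed_path = Path(prefix + ".bed")
    with open(bed_path, "rb") as handle:
        header = handle.read(3)
    if len(header) < 3 or header[:2] != PLINK_BED_MAGIC:
        raise ValueError("missing PLINK 1 .bed magic number")
    if header[2] != VARIANT_MAJOR:
        raise ValueError(".bed file is not in variant-major mode")

    # Each variant occupies ceil(n_samples / 4) bytes (2 bits per genotype).
    expected_size = 3 + n_variants * math.ceil(n_samples / 4)
    actual_size = bed_path.stat().st_size
    if actual_size != expected_size:
        raise ValueError(
            f".bed size {actual_size} does not match expected {expected_size}"
        )
    return n_samples, n_variants
```

For a fileset delivered as, say, `mydata.bed`, `mydata.bim` and `mydata.fam` (a hypothetical prefix), `check_plink_fileset("mydata")` would return the sample and variant counts, which you can compare with what the array center says they genotyped.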


