I am very much a beginner in genetic data analysis. I have recently been learning to perform GWAS in R by following the article "A guide to genome-wide association analysis and post-analytic interrogation". For the SNP imputation step, the authors use SNP data on Chr16 as a demonstration. They load the "chr16_1000g_CEU.ped" and "chr16_1000g_CEU.info" files into R with the read.pedfile function from the snpStats package (the files are publicly available from https://www.mtholyoke.edu/courses/afoulkes/Data/GWAStutorial/).
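For reference, the loading step I am trying to reproduce looks roughly like the sketch below. This is my own paraphrase rather than the authors' exact code: the variable names (ped_file, info_file, thougeno) are mine, I am assuming both files sit in the working directory, and I am assuming the SNP names are in the first column of the .info file (hence which = 1).

```r
# Sketch only: assumes snpStats is installed and both tutorial files
# have been downloaded into the current working directory.
ped_file  <- "chr16_1000g_CEU.ped"   # genotype data in pedfile format
info_file <- "chr16_1000g_CEU.info"  # per-SNP information file

if (requireNamespace("snpStats", quietly = TRUE) &&
    file.exists(ped_file) && file.exists(info_file)) {
  # `snps` names the file holding the SNP information; `which = 1`
  # assumes the SNP names are in its first column.
  thougeno <- snpStats::read.pedfile(ped_file, snps = info_file, which = 1)

  # read.pedfile returns a list: `genotypes` is a SnpMatrix and
  # `map` is a data frame with one row per SNP.
  print(dim(thougeno$genotypes))
  print(head(thougeno$map))
} else {
  message("snpStats or the tutorial files are not available; nothing loaded.")
}
```

So whatever I download for the other chromosomes would ideally end up in a form that this same call can read.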
I would like to find 1000 Genomes SNP data for the other chromosomes. At ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/analysis_results/integrated_call_sets/, I found a vcf.gz file and a vcf.gz.tbi file for each chromosome. For example, for chromosome 16, I found "ALL.chr16.integrated_phase1_v3.20101123.snps_indels_svs.genotypes.vcf.gz" and "ALL.chr16.integrated_phase1_v3.20101123.snps_indels_svs.genotypes.vcf.gz.tbi".
My questions are:
Are the vcf.gz and vcf.gz.tbi files I found for Chr16 equivalent to the "chr16_1000g_CEU.ped" and "chr16_1000g_CEU.info" files the authors provided? If so, I could simply download the corresponding files for the other chromosomes and use them in my own GWAS.
My understanding is that the vcf.gz file contains the genotype information and the vcf.gz.tbi file contains the position information. I downloaded these two files from the 1000 Genomes FTP site and tried to load them into R, but failed. I also tried the approach in an 8-year-old Biostars post (Loading 1000 Genomes Vcf Files In R), but it did not work. My guess is that the vcf.gz file is analogous to "chr16_1000g_CEU.ped" from the paper and the vcf.gz.tbi file is analogous to "chr16_1000g_CEU.info". However, I have not found a way to convert vcf.gz to .ped and vcf.gz.tbi to .info before loading them into R, nor a method that loads vcf.gz and vcf.gz.tbi into R directly. Any solution is welcome.
Thanks, Patrick Lv