I am new to the field of bioinformatics. I have downloaded the file "ALL.chr1.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz" and its tabix index "ALL.chr1.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz.tbi" from ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20110521/.
1) The above vcf.gz contains the compressed chromosome 1 variant calls (genotypes) of 1092 individuals. How can I get 1092 separate sequences in FASTA format?
I read about the vcf-consensus script, but I am confused about how to use it here:
cat ref.fa | vcf-consensus file.vcf.gz > out.fa
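For a single individual, the call might look roughly like the sketch below. It assumes that vcf-consensus accepts `-s` (sample name) and `-H` (haplotype, 1 or 2), that ref.fa holds the matching chromosome 1 reference, and that the sample IDs appear as columns in the VCF header. The helper name `consensus_cmd` and the output file names are made up for illustration. The script only prints the commands, so they can be checked before anything is run.

```shell
#!/bin/sh
# Sketch only: prints (does not run) one vcf-consensus command per sample
# and haplotype. REF is a placeholder name for the reference FASTA.
# Assumption: -s selects the sample and -H selects one of its two haplotypes.
REF="ref.fa"
VCF="ALL.chr1.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz"

# Hypothetical helper: $1 = sample ID, $2 = haplotype (1 or 2)
consensus_cmd() {
    printf 'cat %s | vcf-consensus -s %s -H %s %s > %s.chr1.hap%s.fa\n' \
        "$REF" "$1" "$2" "$VCF" "$1" "$2"
}

# Two sample IDs from the question; the full list would come from the VCF header.
for sample in HG00096 HG00097; do
    for hap in 1 2; do
        consensus_cmd "$sample" "$hap"
    done
done
```

Each printed line has the same shape as the generic command above, with one sample and one haplotype selected, so every individual would end up with two FASTA files per chromosome.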
Does the phase 1 release of the 1000 Genomes Project use the following reference genome (as mentioned in ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20110521/README.phase1_integrated_release_version3_20120430)?
2) How can I get the entire genome sequence in FASTA format for HG00096, HG00097, etc.?
Thanking you in advance.