Question: Combining in-house data with 1000 Genomes for PCA
ankita wrote (4.0 years ago):

For PCA, I have a set of coordinates common to my in-house exome data and the 1000 Genomes data (Phase 1). I want to retrieve the genotypes for those common SNPs from the 1000 Genomes VCF files (and then convert them to PLINK format for smartpca). One option I considered is converting all the Phase 1 VCF files to .ped, but that is very memory intensive. What would be a better solution to this problem?


Thanks, I will try this :)

Reply written 4.0 years ago by ankita

Consider using the 1000 Genomes data to impute the genotypes of the SNPs missing from your dataset: http://www.1000genomes.org/faq/can-i-use-1000-genomes-data-imputation

Reply written 4.0 years ago by Giovanni M Dall'Olio
chrchang523 wrote (4.0 years ago):

PLINK 1.9 supports direct conversion of VCF to .bed + .bim + .fam, which should be readable by smartpca. For example,

plink --vcf ALL.chr1.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz --out 1000g_chr1

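To restrict the conversion to the SNPs shared with the in-house data, one possible approach is to hand PLINK 1.9 a set-range file via --extract range. The following is only a sketch: the file names common_positions.txt and common_ranges.txt, the output prefix, and the positions themselves are placeholders, and it assumes the chromosome codes in the list match those in the VCF.

```shell
#!/bin/sh
# Placeholder input: one "chromosome position" pair per line for the shared SNPs.
# Replace these illustrative values with the real list of common coordinates.
printf '1 1000\n1 2000\n2 1500\n' > common_positions.txt

# Build a PLINK set-range file: chromosome, start bp, end bp, label.
# Each SNP becomes a one-base range, labeled snp1, snp2, ...
awk '{ print $1, $2, $2, "snp" NR }' common_positions.txt > common_ranges.txt

# Convert one chromosome, keeping only variants that fall inside those ranges.
# The plink call is skipped when plink or the VCF is not available.
vcf=ALL.chr1.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz
if command -v plink >/dev/null 2>&1 && [ -f "$vcf" ]; then
    plink --vcf "$vcf" --extract range common_ranges.txt \
          --make-bed --out 1000g_chr1_common
fi
```

Filtering at conversion time keeps each per-chromosome fileset small, which should make any later merging step cheaper.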
ankita wrote (4.0 years ago):

Hi

I am using vcf-concat to combine all the 1000 Genomes VCF files and then running PLINK 1.9 on the result. Is vcf-concat the only way to concatenate them, or is there a way to do it within PLINK 1.9, since PLINK is much faster?


You can use plink --merge-list.

Reply written 4.0 years ago by chrchang523
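A minimal sketch of the --merge-list route, assuming each chromosome has already been converted to a binary fileset. The prefixes 1000g_chr1 through 1000g_chr22 and the list name merge_list.txt are hypothetical; each line of the list holds one fileset prefix.

```shell
#!/bin/sh
# Write one fileset prefix per line for chromosomes 2-22.
# Chromosome 1 is passed separately as the reference fileset.
: > merge_list.txt
chr=2
while [ "$chr" -le 22 ]; do
    echo "1000g_chr${chr}" >> merge_list.txt
    chr=$((chr + 1))
done

# Merge everything into a single fileset.
# The plink call is skipped when plink or the inputs are not available.
if command -v plink >/dev/null 2>&1 && [ -f 1000g_chr1.bed ]; then
    plink --bfile 1000g_chr1 --merge-list merge_list.txt \
          --make-bed --out 1000g_all
fi
```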
ankita wrote (4.0 years ago):

I used --merge-list, but it throws an error and warnings about SNP inconsistencies. My understanding is that merging is meant for files that contain the same SNPs for different individuals. What I want is just to concatenate VCF files from different chromosomes, say chr1 and chr2, into a single PLINK fileset. In short, I want to concatenate all the 1000 Genomes VCF files to make one PLINK fileset.


bcftools concat may be your best bet.  (--merge-list also handles concatenation, but each variant needs to have a unique ID.)

Reply written 4.0 years ago by chrchang523
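The bcftools concat suggestion could look roughly like the sketch below. The per-chromosome names chr1.vcf.gz through chr22.vcf.gz, the list name vcf_list.txt, and the output prefix 1000g_all are all placeholders. It assumes every file contains the same samples in the same order, which bcftools concat requires.

```shell
#!/bin/sh
# List the per-chromosome VCFs in chromosome order, one path per line.
: > vcf_list.txt
chr=1
while [ "$chr" -le 22 ]; do
    echo "chr${chr}.vcf.gz" >> vcf_list.txt
    chr=$((chr + 1))
done

# Concatenate into one bgzipped VCF, then convert it to a PLINK binary fileset.
# Both steps are skipped when the tools or inputs are not available.
if command -v bcftools >/dev/null 2>&1 \
   && command -v plink >/dev/null 2>&1 \
   && [ -f chr1.vcf.gz ]; then
    bcftools concat --file-list vcf_list.txt -Oz -o 1000g_all.vcf.gz
    plink --vcf 1000g_all.vcf.gz --make-bed --out 1000g_all
fi
```

For the --merge-list route, variants whose ID is missing ('.') can be given position-based names with PLINK 1.9's --set-missing-var-ids (for example the template @:#). Variants that share a position may still end up with duplicate IDs and need separate handling.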