Combining in-house data with 1000 Genomes for PCA
9.4 years ago
ankita ▴ 20

For PCA, I have a set of coordinates common to my in-house exome data and the 1000 Genomes data (Phase 1). I want to retrieve the genotypes for those common SNPs from the 1000 Genomes VCF files (and then convert them to PLINK format for smartpca). One option I considered is converting all the Phase 1 VCF files to .ped, but that is very memory intensive. What would be a better way to do this?

VCF 1000Genome • 5.9k views

Thanks, I will try this :)


Consider using the 1000 Genomes data to impute the genotypes of the SNPs missing from your dataset: http://www.1000genomes.org/faq/can-i-use-1000-genomes-data-imputation

9.4 years ago

PLINK 1.9 supports direct conversion of VCF to .bed + .bim + .fam, which should be readable by smartpca. For example:

plink --vcf ALL.chr1.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz --out 1000g_chr1
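
To keep only the SNPs shared with the in-house data, one option is to filter each VCF against a list of positions before converting it. Below is a minimal sketch using standard shell tools. It assumes a two-column, tab-separated positions file (chromosome, position) whose chromosome names match those in the VCF; the function name filter_vcf_by_pos and the file name common_positions.txt are made up for illustration.

```shell
# Hypothetical helper: print the VCF header plus only those records whose
# CHROM/POS pair is listed in a tab-separated positions file (chrom<TAB>pos).
filter_vcf_by_pos() {
    positions=$1   # e.g. common_positions.txt (assumed name)
    vcf_gz=$2      # a gzip-compressed per-chromosome VCF
    gzip -dc "$vcf_gz" | awk -F'\t' -v posfile="$positions" '
        BEGIN {
            # Load the wanted positions into a lookup table.
            while ((getline line < posfile) > 0) {
                split(line, f, "\t")
                keep[f[1] "\t" f[2]] = 1
            }
        }
        /^#/ { print; next }        # always keep header lines
        ($1 "\t" $2) in keep        # keep records at a listed position
    '
}
```

Usage would look like `filter_vcf_by_pos common_positions.txt ALL.chr1.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz > chr1_common.vcf`, followed by the same plink --vcf conversion on the smaller file. This streams the file line by line, so memory use stays low, but it matches on position only and does not check alleles.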
9.4 years ago
ankita ▴ 20

Hi

I am using vcf-concat to combine all the VCF files from 1000 Genomes and then running PLINK 1.9 on the result. Is vcf-concat the only way to concatenate them, or is there a way to do it in PLINK 1.9, given that it is much faster?


You can use plink --merge-list.
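
For reference, --merge-list takes a text file naming the filesets to add to a base fileset. Here is a sketch of building that file, assuming each chromosome has already been converted to a binary fileset with a prefix of the form 1000g_chrN (as in the chr1 example above); the names merge_list.txt and 1000g_all are arbitrary.

```shell
# Write one binary-fileset prefix per line for chromosomes 2..22;
# chromosome 1 is used as the base fileset passed to --bfile.
: > merge_list.txt
chr=2
while [ "$chr" -le 22 ]; do
    echo "1000g_chr${chr}" >> merge_list.txt
    chr=$((chr + 1))
done

# Then (not run here; needs PLINK 1.9 and the converted filesets):
# plink --bfile 1000g_chr1 --merge-list merge_list.txt --make-bed --out 1000g_all
```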


I have used --merge-list, but it throws an error and warnings about SNP inconsistencies. As I understand it, merging is meant for files that contain the same SNPs for different individuals, whereas I just want to concatenate VCF files from different chromosomes (say chr1 and chr2) into a single PLINK fileset. In short, I want to concatenate all the 1000 Genomes VCF files into one PLINK file.


bcftools concat may be your best bet. (--merge-list also handles concatenation, but each variant needs to have a unique ID.)
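
If the --merge-list route is preferred, the unique-ID requirement has to be met first. The sketch below fills in missing variant IDs before conversion, on the assumption (not confirmed in this thread) that the conflicts come from records whose ID is "."; the function name fill_missing_ids is hypothetical.

```shell
# Hypothetical helper: read an uncompressed VCF on stdin and replace a
# missing ID (".") in column 3 with CHROM:POS:REF:ALT. Existing IDs and
# header lines are passed through unchanged.
fill_missing_ids() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/      { print; next }
         $3 == "." { $3 = $1 ":" $2 ":" $4 ":" $5 }
                   { print }'
}
```

It can be used in a pipeline such as `gzip -dc in.vcf.gz | fill_missing_ids | gzip -c > out.vcf.gz`. This does not resolve duplicated rsIDs or repeated records at the same site, so those would still need separate handling. PLINK 1.9 also has a --set-missing-var-ids flag for the same purpose, and for plain concatenation something like `bcftools concat -O z -o 1000g_all.vcf.gz chr1.vcf.gz chr2.vcf.gz` (file names assumed) avoids the ID issue entirely.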
