Question: Plink. Convert ped files of artificially mixed population into vcf using recode
2.7 years ago, Mr Locuace wrote:

Hello, I would really appreciate some feedback on a problem. I am using Plink 1.9.

I would like to get phased genotypes from an artificially mixed human population.

I merged two populations from different sources using Plink (--merge), creating pop_A. I checked that there were no strand problems and made sure that both populations have the same SNPs, in addition to performing other controls. Then, I split pop_A into 22 ped/map files, one per chromosome.

The phasing is done with Beagle, which requires vcf files, one per chromosome. For that, I converted the 22 ped/map files to vcf format using --recode vcf. However, some websites recommend using --recode vcf-iid instead, and I don't understand the difference between these commands. So my first question is: which of these commands should I use, and why?
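For reference, the per-chromosome conversion step described above could be scripted roughly as in the sketch below. It only builds and prints the command lines; it does not run PLINK. The executable name `plink`, the file prefix pattern `pop_A_chr{chrom}`, and the function name `recode_commands` are assumptions made for illustration, so adjust them to the actual setup. The flags `--file`, `--recode` and `--out` are the standard PLINK 1.9 ones.

```python
import shlex


def recode_commands(prefix_template="pop_A_chr{chrom}", modifier="vcf",
                    chromosomes=range(1, 23), plink_exe="plink"):
    """Build one PLINK command per chromosome (hypothetical file prefixes).

    Each command reads <prefix>.ped/.map via --file and writes <prefix>.vcf.
    `modifier` is the --recode modifier: "vcf", "vcf-fid" or "vcf-iid".
    """
    commands = []
    for chrom in chromosomes:
        prefix = prefix_template.format(chrom=chrom)
        commands.append([plink_exe, "--file", prefix,
                         "--recode", modifier, "--out", prefix])
    return commands


if __name__ == "__main__":
    # Print the commands so they can be inspected before running anything.
    for cmd in recode_commands(modifier="vcf-iid"):
        print(shlex.join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` once the printed commands look right.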

Now the second question. Since the two original populations come from different sources (POPRES and 1000 Genomes), is it OK to convert the pop_A plink files to vcf using just --recode vcf (or vcf-iid), or should I consider other issues as well?

Thank you very much in advance.

Tags: phasing, plink, vcf
2.7 years ago, Wietje wrote:

The recoding options determine how the sample ID in the VCF file is composed. The PLINK manual describes them as follows:

"The 'vcf', 'vcf-fid', and 'vcf-iid' modifiers result in production of a VCFv4.2 file. 'vcf-fid' and 'vcf-iid' cause family IDs and within-family IDs respectively to be used for the sample IDs in the last header row, while 'vcf' merges both IDs and puts an underscore between them (in this case, a warning will be given if an ID already contains an underscore)."

The iid and fid options are mainly of interest when working with family data sets. Are your samples related at all? Does ancestry or relationship play a role? If not, I would say you can simply go with the regular --recode vcf.
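To make the quoted rule concrete, here is a small Python sketch that mimics how the three modifiers compose a sample ID from the family ID (FID) and within-family ID (IID). It is only an illustration of the rule as quoted, not PLINK's own code. The function names and the example IDs are made up. The uniqueness check is an added precaution, on the assumption that with vcf-iid it is worth confirming that no IID occurs in both source data sets.

```python
from collections import Counter


def vcf_sample_id(fid, iid, modifier="vcf"):
    """Compose a VCF sample ID following the rule quoted from the manual."""
    if modifier == "vcf-fid":
        return fid                  # family ID only
    if modifier == "vcf-iid":
        return iid                  # within-family ID only
    if modifier == "vcf":
        return f"{fid}_{iid}"       # both IDs joined by an underscore
    raise ValueError(f"unknown modifier: {modifier!r}")


def underscore_warning(fid, iid):
    """True if either ID already contains '_', the case the manual warns about."""
    return "_" in fid or "_" in iid


def duplicated_ids(samples, modifier):
    """Return sample IDs that would occur more than once under `modifier`."""
    counts = Counter(vcf_sample_id(fid, iid, modifier) for fid, iid in samples)
    return sorted(sid for sid, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Hypothetical (FID, IID) pairs from two sources that reuse the IID "S1".
    samples = [("SRC1", "S1"), ("SRC2", "S1"), ("SRC2", "S2")]
    for modifier in ("vcf", "vcf-fid", "vcf-iid"):
        ids = [vcf_sample_id(fid, iid, modifier) for fid, iid in samples]
        print(modifier, ids, "duplicates:", duplicated_ids(samples, modifier))
```

In this made-up example, plain `vcf` keeps all three IDs distinct, while `vcf-iid` would produce "S1" twice. For unrelated samples with IIDs that are unique across both sources, either modifier gives unambiguous sample IDs.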


Thank you, @Wietje! No, the samples are not related.

Reply written 2.7 years ago by Mr Locuace