Question: Plink. Convert ped files of artificially mixed population into vcf using recode
Mr Locuace (Chile) wrote, 16 months ago:

Hello, I would really appreciate some feedback on a problem. I am using Plink 1.9.

I would like to get phased genotypes from an artificially mixed human population.

I merged two populations from different sources using Plink (--merge), creating pop_A. I checked that there were no strand problems and made sure that both populations contain the same SNPs, in addition to performing other controls. Then I split pop_A into 22 ped/map files.

The phasing is done with Beagle, which requires vcf files, one per chromosome. For that, I converted the 22 ped/map files to vcf format using --recode vcf. However, some websites recommend using --recode vcf-iid instead, and I don't understand the difference between these commands. So my first question is: which of these commands should I use, and why?

Now the second question. Since the two original populations come from different sources (POPRES and 1000 Genomes), is it OK to convert the pop_A plink files to vcf just using --recode vcf (or vcf-iid), or should I consider other issues as well?

Thank you very much in advance.

Wietje (Germany) wrote, 16 months ago:

The recode modifiers determine how the sample ID in the VCF file is composed. Check the PLINK manual (linked below) for that:

"The 'vcf', 'vcf-fid', and 'vcf-iid' modifiers result in production of a VCFv4.2 file. 'vcf-fid' and 'vcf-iid' cause family IDs and within-family IDs respectively to be used for the sample IDs in the last header row, while 'vcf' merges both IDs and puts an underscore between them (in this case, a warning will be given if an ID already contains an underscore)."

https://www.cog-genomics.org/plink/1.9/data
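
To make the quoted rule concrete, here is a small Python sketch that mimics how the three modifiers compose the sample ID. It only illustrates the manual's description and is not PLINK's own code; the function name `vcf_sample_id` and the IDs `FAM1`/`IND1` are made up for the example.

```python
import warnings


def vcf_sample_id(fid: str, iid: str, modifier: str = "vcf") -> str:
    """Return the VCF sample ID as described in the quoted manual text.

    fid      -- family ID (first column of the .ped file)
    iid      -- within-family ID (second column of the .ped file)
    modifier -- one of "vcf", "vcf-fid", "vcf-iid"
    """
    if modifier == "vcf-fid":
        return fid
    if modifier == "vcf-iid":
        return iid
    if modifier == "vcf":
        # The manual says a warning is given if an ID already
        # contains an underscore, since the merged ID gets ambiguous.
        if "_" in fid or "_" in iid:
            warnings.warn("an ID already contains an underscore")
        return f"{fid}_{iid}"
    raise ValueError(f"unknown modifier: {modifier}")


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    for modifier in ("vcf", "vcf-fid", "vcf-iid"):
        print(modifier, "->", vcf_sample_id("FAM1", "IND1", modifier))
```

Following the same rule, a sample whose family ID equals its within-family ID ends up with a doubled ID of the form `ID_ID` under plain `vcf`.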

The iid and fid options are mainly interesting when working with family sets. Are your samples related at all? Does ancestry or relationship play a role? If not, I would say you can simply go with the regular --recode vcf.
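
For the per-chromosome conversion itself, something like the following Python sketch could generate the 22 command lines. It assumes the PLINK 1.9 binary is called `plink` and that the per-chromosome ped/map files share hypothetical prefixes `pop_A_chr1` ... `pop_A_chr22`; adjust both to your setup. It only prints the commands rather than running them.

```python
import shlex


def recode_commands(prefix_template="pop_A_chr{chrom}",  # hypothetical file prefix
                    modifier="vcf",                      # or "vcf-fid" / "vcf-iid"
                    chromosomes=range(1, 23),            # autosomes 1..22
                    plink="plink"):                      # assumed binary name
    """Build one PLINK command per chromosome, as argument lists."""
    commands = []
    for chrom in chromosomes:
        prefix = prefix_template.format(chrom=chrom)
        # --file reads <prefix>.ped and <prefix>.map;
        # --recode <modifier> with --out writes <prefix>.vcf
        commands.append(
            [plink, "--file", prefix, "--recode", modifier, "--out", prefix]
        )
    return commands


if __name__ == "__main__":
    for cmd in recode_commands():
        print(shlex.join(cmd))
        # To actually run PLINK instead of printing:
        #   import subprocess; subprocess.run(cmd, check=True)
```

Switching to the other behaviour is then just `recode_commands(modifier="vcf-iid")`.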

Mr Locuace replied, 16 months ago:

Thank you, @Wietje! No, the samples are not related.