Question: How is an unphased VCF converted into a PED file?
8 months ago, jingjin2203 wrote:

Hi All,

I have some ddRADseq data from a diploid organism I'm working on.

I've generated an unphased VCF file using freebayes that I want to convert into a PED file. I was wondering how VCF-to-PED conversion deals with unphased VCF data. When I further converted the PED to FASTA, each sample had two sequences, and the two sequences for each sample were different. So how does the conversion program decide which allele at a heterozygous site goes into which sequence?

Hope my question makes sense. Any answers or comments will be appreciated!

Thanks!

8 months ago, Kevin Blighe wrote:

I presume that you mean the running of plink --vcf on your file, i.e., in order to convert it to PLINK PED format?

The latest implementation of PLINK ignores phasing information. At a heterozygous site, the variant allele becomes A1 whilst the reference allele becomes A2. At a homozygous variant site, the variant allele is set to both A1 and A2. The order of the two alleles in the PED file therefore does not tell you which chromosome copy each allele came from.

Kevin
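
To make the behaviour described above concrete, here is a minimal Python sketch of how a single biallelic genotype could be mapped to a PED allele pair when phase is ignored. It is not PLINK's code: `gt_to_ped_alleles` is a hypothetical helper, and the variant-first ordering simply follows the description in this answer rather than anything checked against PLINK's source.

```python
import re


def gt_to_ped_alleles(ref, alt, gt):
    """Map one biallelic, diploid VCF GT string to a PED allele pair.

    Hypothetical helper for illustration only; not part of PLINK.
    """
    # '/' (unphased) and '|' (phased) are split identically,
    # which is exactly where any phase information is lost.
    indices = re.split(r"[/|]", gt)
    if len(indices) != 2 or "." in indices:
        return ("0", "0")  # "0" is the PED code for a missing allele
    alleles = [ref, alt]
    pair = [alleles[int(i)] for i in indices]
    # Assumed fixed output order: variant allele first, reference second.
    pair.sort(key=lambda allele: allele == ref)
    return tuple(pair)


if __name__ == "__main__":
    for gt in ("0/1", "1|0", "0|1", "1/1", "./."):
        print(gt, gt_to_ped_alleles("A", "G", gt))
```

Because `0/1`, `0|1` and `1|0` all give the same pair, any FASTA built from the two PED allele columns puts the alleles of each heterozygous site into the two sequences by convention only. The two sequences per sample are not real haplotypes.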


Thanks, Kevin! Yes, that's exactly what I was asking. I really appreciate your kind reply! Just a follow-up question: do you know how I can convert a phased VCF file to PLINK PED format with the phasing information incorporated?
Thank you!

written 8 months ago by jingjin2203

I'm not sure that phasing information is ever taken into account in PLINK. The person who will know is chrchang523.

If you take a look here: https://www.cog-genomics.org/plink/1.9/input#vcf

--vcf loads a (possibly gzipped) VCF file, extracting information which can be represented by the PLINK 1 binary format and ignoring everything else (after applying the load filters described below). For example, phase and dosage information are currently discarded. (This situation will improve in the future, but we do not have plans to try to handle everything in the file.)

It says that phasing information is "discarded" ...

written 8 months ago by Kevin Blighe
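
Since `--vcf` discards phase, one workaround is to read the phased genotypes straight from the VCF instead of going through PED. Below is a rough Python sketch under several assumptions: the file is diploid, contains SNPs only, and is phased consistently along each chromosome. `phased_haplotypes` is a hypothetical helper, not a PLINK feature. Heterozygous calls that are unphased, and missing calls, are written as `N`.

```python
import re


def phased_haplotypes(vcf_lines):
    """Return {sample: [hap1, hap2]} built from the text lines of a VCF.

    Hypothetical helper; assumes diploid SNP calls.
    """
    samples, haps = [], {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            haps = {s: ["", ""] for s in samples}
            continue
        alleles = [fields[3]] + fields[4].split(",")  # REF, then ALT alleles
        gt_pos = fields[8].split(":").index("GT")
        for sample, call in zip(samples, fields[9:]):
            gt = call.split(":")[gt_pos]
            idx = re.split(r"[/|]", gt)
            if len(idx) != 2 or "." in idx:
                a = b = "N"  # missing or non-diploid call
            elif "|" in gt or idx[0] == idx[1]:
                # Phased call, or homozygous (phase is irrelevant there).
                a, b = alleles[int(idx[0])], alleles[int(idx[1])]
            else:
                a = b = "N"  # unphased heterozygote: phase unknown
            haps[sample][0] += a
            haps[sample][1] += b
    return haps


if __name__ == "__main__":
    demo = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1",
        "chr1\t10\t.\tA\tG\t.\t.\t.\tGT\t0|1",
        "chr1\t20\t.\tC\tT\t.\t.\t.\tGT\t1|0",
    ]
    print(phased_haplotypes(demo))
```

Each sample's two strings can then be written out as two FASTA records. Note that freebayes output is unphased, so the VCF would first have to be phased with a separate tool for this to give anything other than `N` at heterozygous sites.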

Thank you, Kevin! I really appreciate it!

written 8 months ago by jingjin2203