Question: How to convert a VCF file to PLINK PED format?
Ketan Padiya wrote, 8.6 years ago:

Hi Biostar readers,

This is my first post to Biostar.

I have a VCF file containing variants that I want to convert to PLINK PED format.

I have tried using: vcftools --vcf *.vcf --plink

But when I load the resulting file into WGAViewer, it says "SNP and p value cannot be the same". Why?

Is there any other software to convert VCF to PLINK PED format?

Thanks.


Could you post some lines of the output that you obtain with your vcftools command?

Reply written 8.4 years ago by Pablo Marin-Garcia
Lídia (Spain) wrote, 5.1 years ago:

I have read that GATK also has a tool for this conversion, called VariantsToBinaryPed. I have not used it yet, but it looks useful and I will give it a try.

(https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_variantutils_VariantsToBinaryPed.html)

 

 


Has anyone used this successfully? I tried, but it asks me for a metadata file with a .fam extension. Given that I am trying to create these very files (i.e. .bim, .bed, and .fam), this confused me. I couldn't find a good article explaining how this works.

Reply written 4.0 years ago by HumeMarx

I am still new to PLINK and to variant calling in general, so someone with more experience may correct me if needed. I generally have a VCF file that I need to convert to PLINK format for certain analyses. When I do this, I manually create the .fam file for all my samples. I'm not aware of any other way, as the information needed in the .fam file is not stored in the VCF file; it is meta information about your samples. The PLINK documentation (https://www.cog-genomics.org/plink2/formats#fam) gives a good explanation of the .fam file format.

Reply written 3.5 years ago by alolex
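
To illustrate the manual .fam step described above, here is a minimal Python sketch. It takes the sample IDs from the #CHROM header line of a VCF and writes one .fam row per sample. The function name, the example sample IDs, and the choice to reuse the sample ID as both family ID and individual ID are assumptions made for illustration; sex and phenotype default to the "unknown" codes (0 and -9) unless you supply them, since the VCF does not carry that information.

```python
def fam_lines_from_vcf_header(vcf_text, sex=None, phenotype=None):
    """Build .fam rows (FID IID father mother sex phenotype) from VCF text.

    Assumption: family ID is set equal to the sample ID, and both parents
    are unknown (0). sex/phenotype are optional dicts keyed by sample ID.
    """
    sex = sex or {}
    phenotype = phenotype or {}
    for line in vcf_text.splitlines():
        if line.startswith("#CHROM"):
            # Columns 10 onwards of the header line are the sample IDs.
            samples = line.split("\t")[9:]
            break
    else:
        raise ValueError("no #CHROM header line found")
    return [
        " ".join([s, s, "0", "0", str(sex.get(s, 0)), str(phenotype.get(s, -9))])
        for s in samples
    ]


# Hypothetical two-sample header, for demonstration only.
example_header = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNA001\tNA002\n"
)
fam = fam_lines_from_vcf_header(
    example_header, sex={"NA001": 1}, phenotype={"NA002": 2}
)
print("\n".join(fam))
```

The rows are emitted in the same order as the samples appear in the VCF header, which is the order a genotype file derived from that VCF would normally use.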

@alolex, do you do this using the above-mentioned GATK tool, VariantsToBinaryPed, or with PLINK's --make-bed?

Reply written 16 months ago by gaelgarcia05190
Pablo Marin-Garcia (Spain) wrote, 8.4 years ago:

Hello Ketan,

The vcftools 'plink' output is the PED ('pedigree') format that PLINK uses as its primary INPUT. WGAViewer accepts the OUTPUT of PLINK (the association test results) and displays the p-values in a genomic view.

A PED file (and its companion MAP file) does not contain any p-values. The p-values can be obtained by running the association test with PLINK (see the PLINK manual).

As you can see from the error you got, "SNP and p value cannot be the same", WGAViewer wants a file with p-values, not the PED files.
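
To make the point about PED/MAP contents concrete, here is a small pure-Python sketch that turns a tiny in-memory VCF into PED and MAP lines. It is not how vcftools or PLINK implement the conversion; it only shows which columns end up in each file. Assumptions (all mine): GT is the first FORMAT field, missing or non-diploid calls become "0 0", the sample ID is reused as family ID, and parents, sex, and phenotype are set to the unknown codes 0, 0, 0, and -9. The example records are made up.

```python
def vcf_to_ped_map(vcf_text):
    """Return (ped_lines, map_lines) for a small VCF given as text."""
    samples = []
    genotypes = {}  # sample ID -> flat list of allele letters
    map_lines = []
    for line in vcf_text.splitlines():
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]
            genotypes = {s: [] for s in samples}
            continue
        chrom, pos, snp_id, ref, alt = fields[:5]
        if snp_id == ".":
            snp_id = f"{chrom}:{pos}"  # placeholder ID when the VCF has none
        # MAP columns: chromosome, variant ID, genetic distance, bp position
        map_lines.append(f"{chrom}\t{snp_id}\t0\t{pos}")
        alleles = [ref] + alt.split(",")
        for sample, call in zip(samples, fields[9:]):
            gt = call.split(":")[0].replace("|", "/").split("/")
            if len(gt) != 2 or "." in gt:
                pair = ["0", "0"]  # missing genotype in PED notation
            else:
                pair = [alleles[int(a)] for a in gt]
            genotypes[sample].extend(pair)
    # PED columns: FID, IID, father, mother, sex, phenotype, then allele pairs
    ped_lines = [
        " ".join([s, s, "0", "0", "0", "-9"] + genotypes[s]) for s in samples
    ]
    return ped_lines, map_lines


example_vcf = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n"
    "1\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0/1\t1/1\n"
    "1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t./.\t0|0\n"
)
ped, map_ = vcf_to_ped_map(example_vcf)
print("\n".join(ped))
print("\n".join(map_))
```

Neither file has a column for a test statistic or p-value: the PED rows hold sample information plus genotypes, and the MAP rows hold variant positions. Values such as p-values only appear in the result files of an association analysis.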


Hello, I want to know how I can convert HapMap genotype data to PLINK format. I first use peas to convert to normal format and then to plink_in format. It creates several output files, but no map file. However, the pedsnp file seems to be the same as a map file.

But when I load it into PLINK, it fails with the message: ped collume wrong.

So could anybody tell me how to do this?

Reply written 8.3 years ago by J.F.Jiang

Thomas Johnson (New Zealand) wrote, 5.7 years ago:

Check out PLINK 1.9, an alpha version of PLINK with VCF support.

https://www.cog-genomics.org/plink2/
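
With PLINK 1.9, the conversion is typically a single command of the form plink --vcf <input> --recode --out <prefix>, which writes <prefix>.ped and <prefix>.map. The Python sketch below only assembles that command line and, optionally, runs it. The function names and the file names are placeholders, and the executable is assumed to be on the PATH under the name plink.

```python
import shutil
import subprocess


def plink_convert_command(vcf_path, out_prefix, binary=False, plink_exe="plink"):
    """Build a PLINK 1.9 command that converts a VCF.

    --recode writes text .ped/.map files; --make-bed writes the binary
    .bed/.bim/.fam fileset instead.
    """
    output_flag = "--make-bed" if binary else "--recode"
    return [plink_exe, "--vcf", vcf_path, output_flag, "--out", out_prefix]


def run_conversion(vcf_path, out_prefix, binary=False, plink_exe="plink"):
    """Run the conversion; raises if the PLINK executable cannot be found."""
    if shutil.which(plink_exe) is None:
        raise FileNotFoundError(f"{plink_exe} not found on PATH")
    cmd = plink_convert_command(vcf_path, out_prefix, binary, plink_exe)
    subprocess.run(cmd, check=True)


# Placeholder file names; this only prints the command, it does not run it.
cmd = plink_convert_command("variants.vcf", "variants")
print(" ".join(cmd))
```

Calling run_conversion("variants.vcf", "variants") would execute the same command, provided PLINK 1.9 is installed and the input file exists.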


dweeks.pitt (United States) wrote, 5.1 years ago:

The Mega2 program can convert from VCF to PLINK format.

See:  http://watson.hgen.pitt.edu/docs/conversions/vcf_or_bcf_plink.html


user56 (United States) wrote, 7.3 years ago:

You can work with VCFs in R.

To load the file use:

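# Path to the VCF file (in this case saved as UTF-16 text)
file <- 'e:/d/genome/yourVCF.txt'
# Read the tab-separated table; fileEncoding must match the file's real encoding
v <- read.table(file, sep = '\t', header = TRUE, fileEncoding = "utf-16")
# Inspect the structure of the resulting data frame
str(v)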

The UTF-16 encoding was particularly hard to troubleshoot. Eventually Notepad++ helped me to detect this encoding problem.

It correctly ignores the header lines and detects column headers as well.
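
For comparison, a sketch of the same idea in Python, using only the standard library. It explicitly skips the "##" meta-information lines and takes the column names from the "#CHROM" line. The function name and the example records are made up for illustration; for a real file you would pass an open file handle instead, with encoding="utf-16" only if the file really is UTF-16 as in the case above.

```python
import io


def read_vcf_records(handle):
    """Read VCF lines from a text handle into a list of dicts.

    Lines starting with '##' are skipped; the '#CHROM' line supplies
    the column names (with the leading '#' removed).
    """
    columns = None
    records = []
    for line in handle:
        line = line.rstrip("\r\n")
        if not line or line.startswith("##"):
            continue
        if line.startswith("#"):
            columns = line.lstrip("#").split("\t")
            continue
        if columns is None:
            raise ValueError("data line found before the #CHROM header line")
        records.append(dict(zip(columns, line.split("\t"))))
    return records


# Tiny in-memory example standing in for a real file.
example_vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n"
    "1\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0/1\n"
    "1\t200\trs2\tC\tT\t.\tPASS\t.\tGT\t1/1\n"
)
records = read_vcf_records(example_vcf)
print(len(records), records[0]["CHROM"], records[0]["S1"])
```

With a file on disk this would be used as: with open(path, encoding="utf-8") as handle: records = read_vcf_records(handle).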
