vcf file format
6.4 years ago
johnja • 0

I used plink2 to make a VCF file from my .bed, .bim, and .fam files. This produced a VCF v4.3 file. I just realized that vcftools only supports 4.0, 4.1, and 4.2. How can I convert my file to a lower version?


Oh dear, VCF versions rarely have tangible change logs or backwards-compatible tools. I had to go from VCF 4.2 to a rough 4.1 by removing all the problematic loci I found. No clue what's changed in 4.3 :-(


There is a list of changes at the end of the VCF spec file. It is hard to decipher.


Thank you. I actually just downloaded plink version 1.9 instead of 2.0 and then made the vcf file again from the original data.


Have you actually tried whatever you want to do? It may still very well work. The changes in v4.3 aren't all that extreme and mostly consist of a few new tags and making hard lines about how certain tags should be formatted and named.

I'd try it first. At worst, you might have to change some of the header lines.
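
If it does come down to the header, here is a minimal Python sketch of that idea. It assumes an uncompressed text VCF and only rewrites the `##fileformat` line; the function name `downgrade_fileformat` is made up for illustration. It does not check whether the records themselves are valid 4.2, so anything that relies on 4.3-only features would still need fixing by hand.

```python
import io


def downgrade_fileformat(src, dst, old="VCFv4.3", new="VCFv4.2"):
    """Copy VCF text from src to dst, rewriting only the ##fileformat line.

    Hypothetical helper: it relabels the version and leaves every other
    header line and record untouched. Returns True if a line was changed.
    """
    changed = False
    for line in src:
        if not changed and line.strip() == f"##fileformat={old}":
            line = f"##fileformat={new}\n"
            changed = True
        dst.write(line)
    return changed


# Tiny in-memory demo with a made-up record (not real data).
demo = io.StringIO(
    "##fileformat=VCFv4.3\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t100\trs1\tA\tG\t.\t.\t.\n"
)
out = io.StringIO()
was_changed = downgrade_fileformat(demo, out)
print(out.getvalue().splitlines()[0])
```

For a real file you would pass two open file handles instead of the `StringIO` objects, then try the result in vcftools and see what it complains about.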

4.9 years ago
richyanicky ▴ 30

In plink2, --export has a vcf-4.2 option. From the help text:

  • 'vcf', 'vcf-4.2': VCF (default version 4.3). If PAR1 and PAR2 are present, they are automatically merged with chrX, with proper handling of chromosome codes and male ploidy. When the 'bgz' modifier is present, the VCF file is block-gzipped. The 'id-paste' modifier controls which .psam columns are used to construct sample IDs (choices are maybefid, fid, iid, maybesid, and sid; default is maybefid,iid,maybesid), while the 'id-delim' modifier sets the character between the ID pieces (default '_'). Dosages are not exported unless the 'vcf-dosage=' modifier is present. The following five dosage export modes are supported:
      • 'GP': genotype posterior probabilities (v4.3 only).
      • 'DS': Minimac3-style dosages, omitted for hardcalls.
      • 'DS-force': Minimac3-style dosages, never omit.
      • 'HDS': Minimac3-style phased dosages, omitted for hardcalls and unphased calls. Also includes 'DS' output.
      • 'HDS-force': Always report DS and HDS.
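
As a rough sketch of how that might be called on the original .bed/.bim/.fam set, the Python below just assembles the command line. The prefixes `mydata` and `mydata_v42` are placeholders, and the helper names are made up for illustration; `--bfile`, `--export` and `--out` are the usual plink2 flags.

```python
import subprocess


def plink2_export_cmd(bfile_prefix, out_prefix, bgz=False):
    """Build a plink2 command that exports a .bed/.bim/.fam set as VCF v4.2.

    Hypothetical helper; bfile_prefix and out_prefix are placeholder names.
    """
    cmd = ["plink2", "--bfile", bfile_prefix, "--export", "vcf-4.2"]
    if bgz:
        # 'bgz' modifier: block-gzip the output, per the help text above.
        cmd.append("bgz")
    cmd += ["--out", out_prefix]
    return cmd


def run_export(bfile_prefix, out_prefix, bgz=False):
    """Actually run plink2; requires plink2 on PATH and the input files."""
    subprocess.run(plink2_export_cmd(bfile_prefix, out_prefix, bgz), check=True)


cmd = plink2_export_cmd("mydata", "mydata_v42")
print(" ".join(cmd))  # plink2 --bfile mydata --export vcf-4.2 --out mydata_v42
```

Nothing is executed until you call `run_export(...)` yourself. You can equally just paste the printed line into a shell.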
6.4 years ago
Ram 43k

EDIT: As genomax said, the VCFv4.3 doc seems to include the changes from 4.2 - I guess they did not make the same omission twice :-)

The tip mentioned here might be of use if you wish to find the difference between v4.3 and v4.2: https://bioinformatics.stackexchange.com/questions/344/whats-the-difference-between-vcf-spec-versions-4-1-and-4-2

You can download both specs in .tex format and diff them.
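
If you would rather do the comparison from Python than with command-line diff, something like the sketch below should work using the standard `difflib` module. The file names `VCFv4.2.tex` and `VCFv4.3.tex` are assumptions about what you saved the downloads as, and the two short strings in the demo are made-up placeholders, not actual spec text.

```python
import difflib


def diff_specs(old_text, new_text,
               old_name="VCFv4.2.tex", new_name="VCFv4.3.tex"):
    """Return unified-diff lines between two spec texts.

    The default file names are assumed labels for the diff header only.
    """
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",
        n=0,  # no context lines, just the changes
    ))


# Placeholder strings standing in for the two downloaded .tex files.
old = "\\section{Header}\nfileformat is required.\n"
new = "\\section{Header}\nfileformat is required and must be first.\n"

spec_diff = diff_specs(old, new)
for line in spec_diff:
    print(line)
```

With the real files you would read each one with `open(path, encoding="utf-8").read()` and pass the two strings in.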
