VCF to Plink files
14 months ago
hi.there • 0

Hello,

I am hoping somebody with PLINK experience can help. I am trying to generate PLINK .bim, .fam and .bed files from two .vcf files (one with variants filtered out and one that keeps them), and I have tried a few different commands that I found in Biostars posts and on Google.

The documentation for going from .vcf to PLINK files is a bit sparse, so I'd like to check with more experienced researchers here whether I am proceeding correctly.

My results fall into two camps. The .fam file is generated once I add the --allow-extra-chr flag at the end of the command. For both the .bim and .bed files, I get an error:

Error: out.hg38NoVariants-temporary.pvar.zst has a split chromosome. Use
--make-pgen + --sort-vars to remedy this.

Below are the commands I am trying and the output and errors I am receiving. I would be very grateful if somebody could tell me whether my .fam files are correct and how to generate all three files successfully, including exactly how to use "--make-pgen" and "--sort-vars".


Are these producing the correct .fam files?

./plink2 --vcf out.hg38KeepVariants.vcf --make-just-fam --out out.hg38KeepVariants --allow-extra-chr

Writing out.hg38KeepVariants.fam ... done.


My .bed command asks me to add the --allow-extra-chr flag, but after adding it I get a different error:

./plink2 --vcf out.hg38NoVariants.vcf --make-bed --out out.hg38NoVariants    

Error: Invalid chromosome code '15_KI270727v1_random' on line 382274 of --vcf
file.
(Use --allow-extra-chr to force it to be accepted.)
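
For context, hg38 VCFs often contain contigs such as 15_KI270727v1_random alongside the standard chromosomes. As a rough sketch, the helper below lists the chromosome codes in a VCF that fall outside a standard human set. The function name is hypothetical, and the pattern is only an approximation of what plink2 accepts by default (an assumption, not taken from its source).

```shell
# Hypothetical helper (not part of plink2): list distinct chromosome codes
# in a VCF that are not 1-22, X, Y, XY, M/MT, PAR1 or PAR2, with or without
# a "chr" prefix. The accepted set here is an approximation.
list_extra_chroms() {
    awk -F'\t' '
        /^#/ { next }                       # skip header lines
        {
            code = $1
            sub(/^[Cc][Hh][Rr]/, "", code)  # drop an optional chr prefix
            if (code !~ /^([1-9]|1[0-9]|2[0-2]|X|Y|XY|M|MT|PAR1|PAR2)$/ && !(seen[$1]++))
                print $1                    # report each unusual code once
        }
    ' "$1"
}
```

Running `list_extra_chroms out.hg38NoVariants.vcf` would show which contigs are the reason --allow-extra-chr is being requested.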

...now with the --allow-extra-chr flag added.

./plink2 --vcf out.hg38NoVariants.vcf --make-bed --out out.hg38NoVariants --allow-extra-chr

Error: out.hg38NoVariants-temporary.pvar.zst has a split chromosome. Use
--make-pgen + --sort-vars to remedy this.

...with or without the flag, generating a .bim file fails.

./plink2 --vcf out.hg38NoVariants.vcf --make-just-bim --out out.hg38NoVariants

Error: out.hg38NoVariants.vcf has a split chromosome. Use --make-pgen +
--sort-vars to remedy this.
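
For context, "split chromosome" appears to mean that the records for one chromosome code are not contiguous in the file: the code shows up, a different code follows, and then the first one comes back. The sketch below lists any codes that reappear in this way. The function name is hypothetical, and this is an interpretation of the error message rather than plink2's documented check.

```shell
# Hypothetical helper (not part of plink2): print each chromosome code that
# reappears in a VCF after records for a different code have intervened.
find_split_chroms() {
    awk -F'\t' '
        /^#/ { next }                      # skip header lines
        $1 != prev {                       # a new run of records starts here
            if (seen[$1]++ == 1) print $1  # second run of this code: report once
            prev = $1
        }
    ' "$1"
}
```

If `find_split_chroms out.hg38NoVariants.vcf` prints anything, the file is not grouped by chromosome, which is the situation --sort-vars is meant to fix.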

I've preprocessed data before but never SNP data. Again, if anybody has experience with this pipeline, I'd appreciate your help. Thank you.

14 months ago

Have you tried doing what the error message says?

(Note that I’m writing this as an answer rather than just a comment. This is intentional.)


Thank you. I added --allow-extra-chr and a .fam file was made. Is that output correct?

./plink2 --vcf out.hg38KeepVariants.vcf --make-just-fam --out out.hg38KeepVariants --allow-extra-chr

I then followed your advice and ran:

./plink2 --vcf out.hg38NoVariants.vcf --make-pgen --out out.hg38NoVariants --allow-extra-chr --sort-vars

...which generated a .pgen file.

I then used: ./plink2 --pfile out.hg38NoVariants --make-just-bim --out out.hg38NoVariants --allow-extra-chr

...and

./plink2 --pfile out.hg38KeepVariants --make-bed --out out.hg38KeepVariants --allow-extra-chr

...which both ran through without error. Are they correct though?
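
One quick sanity check, as a minimal sketch: the .fam file should have one line per sample column in the VCF header, and the .bim file one line per VCF record, assuming nothing was filtered or split during import. The function names below are hypothetical, and matching counts are a necessary check rather than proof that the conversion is correct.

```shell
# Hypothetical helpers (not part of plink2) for comparing a VCF with the
# PLINK files generated from it.

# Number of variant records: every line that does not start with '#'.
vcf_record_count() {
    awk '!/^#/ { n++ } END { print n + 0 }' "$1"
}

# Number of samples: columns after the 9 fixed ones on the #CHROM line.
vcf_sample_count() {
    awk -F'\t' '/^#CHROM/ { print (NF > 9 ? NF - 9 : 0); exit }' "$1"
}

# Line count of a .bim or .fam file (one line per variant or sample).
line_count() {
    awk 'END { print NR }' "$1"
}

# Usage: check_counts input.vcf prefix
# Succeeds only if prefix.bim and prefix.fam match the VCF counts.
check_counts() {
    [ "$(vcf_record_count "$1")" -eq "$(line_count "$2.bim")" ] &&
    [ "$(vcf_sample_count "$1")" -eq "$(line_count "$2.fam")" ]
}
```

For example, `check_counts out.hg38NoVariants.vcf out.hg38NoVariants && echo "counts match"` compares the original VCF with the files written under that prefix.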
