Adding SNP ID to original VCF file
3.2 years ago
rs146 • 0

Hello,

I am fairly new to bioinformatics and I'm stuck on how to go about this. I have been trying to prune SNPs based on LD from a large VCF file. I have mainly followed this really nice tutorial on how to do this in plink: https://evomics.org/learning/population-and-speciation-genomics/2016-population-and-speciation-genomics/fileformats-vcftools-plink/ However, when I convert my plink binary files back to VCF format, the INFO and QUAL data are lost. I also tried to do this with VCFtools, using:

vcftools --vcf  <original vcf file> --snps snps_ld_0.8.prune.in --recode --recode-INFO-all --out <new vcf>

But I end up with either an empty VCF file or one containing only the header metadata. The tutorial (linked above) has me create a chromosome map that generates IDs for all the SNPs. Do I need to add these to my original VCF file (which only has '.' in the ID column)? If so, can anyone recommend how I would go about this? Thanks for your help.
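
For reference, `vcftools --snps` selects records by the ID column, so a VCF whose IDs are all '.' will not match anything in the list, which may explain the header-only output. Below is a minimal sketch of filling in the missing IDs with awk. It assumes an uncompressed, tab-delimited VCF and that the IDs in `snps_ld_0.8.prune.in` have the form CHROM:POS (worth checking with `head` first); `fill_vcf_ids` and the file names in the usage comment are made up for illustration.

```shell
# Fill missing VCF IDs ('.') with CHROM:POS so they can match an ID list.
# fill_vcf_ids is a hypothetical helper: reads a VCF on stdin, writes to stdout.
fill_vcf_ids() {
  awk 'BEGIN { FS = OFS = "\t" }
       /^#/      { print; next }        # pass header lines through unchanged
       $3 == "." { $3 = $1 ":" $2 }     # only replace IDs that are missing
       { print }'
}

# Example usage (file names are placeholders):
# fill_vcf_ids < original.vcf > original.with_ids.vcf
# vcftools --vcf original.with_ids.vcf --snps snps_ld_0.8.prune.in \
#          --recode --recode-INFO-all --out pruned
```

Records that share a position (for example, split multi-allelic sites) would end up with the same ID under this scheme, so a longer ID template may be needed for such data.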

plink SNP vcftools • 1.5k views
3.2 years ago

To preserve VCF QUAL/INFO data with plink, it is necessary to use plink 2.0 and its updated file format (--pfile/--make-pgen instead of --bfile/--make-bed). In particular, you must use plink2 --vcf for VCF-to-plink2 conversion; VCFtools does not have the necessary export function.
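
A rough sketch of what that workflow could look like, wrapped in a shell function. The function name `prune_vcf_plink2`, the output prefixes, the CHROM:POS ID template, and the 50/5/0.8 window, step, and r^2 values are all assumptions for illustration, not values taken from the question or the tutorial; `--allow-extra-chr` is included on the guess that the data may have non-standard contig names.

```shell
# Sketch of LD pruning with plink2 while keeping QUAL/INFO in the output VCF.
# prune_vcf_plink2 is a hypothetical helper; pruning parameters are placeholders.
prune_vcf_plink2() {
  in_vcf=$1      # original VCF (IDs may all be '.')
  prefix=$2      # prefix for the plink2 fileset and outputs

  # 1. Import the VCF into plink2 format, giving every variant a CHROM:POS ID.
  plink2 --vcf "$in_vcf" --allow-extra-chr \
         --set-all-var-ids '@:#' \
         --make-pgen --out "$prefix" || return 1

  # 2. LD pruning; writes "$prefix.prune.in" and "$prefix.prune.out".
  plink2 --pfile "$prefix" --allow-extra-chr \
         --indep-pairwise 50 5 0.8 \
         --out "$prefix" || return 1

  # 3. Export only the retained variants back to VCF.
  plink2 --pfile "$prefix" --allow-extra-chr \
         --extract "$prefix.prune.in" \
         --export vcf --out "$prefix.pruned" || return 1
}

# Example usage (file names are placeholders):
# prune_vcf_plink2 original.vcf mydata
```

It is worth inspecting the exported VCF to confirm that the QUAL and INFO columns came through as expected. If several variants share a position, the `@:#` template produces duplicate IDs, and a template that also includes the alleles may be needed.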
