Adding SNP ID to original VCF file
10 weeks ago
rs146 • 0

Hello,

I am fairly new to bioinformatics and am stuck on the following. I have been trying to prune SNPs based on LD from a large VCF file, mainly following this really nice tutorial on how to do it in plink: https://evomics.org/learning/population-and-speciation-genomics/2016-population-and-speciation-genomics/fileformats-vcftools-plink/ However, when I convert my plink binary files back to VCF format, the INFO and QUAL data are lost. I also tried to do this with VCFtools, using:

vcftools --vcf  <original vcf file> --snps snps_ld_0.8.prune.in --recode --recode-INFO-all --out <new vcf>


But I end up with either a blank VCF file or one containing only the header metadata. The tutorial (linked above) has me create a chromosome map that generates IDs for all the SNPs. Do I need to add these IDs to my original VCF file (which only has '.' in the ID column)? If so, can anyone recommend how I would go about this? Thanks for your help.

plink SNP vcftools • 215 views
10 weeks ago

To preserve VCF QUAL/INFO data with plink, it is necessary to use plink 2.0 and its updated file format (--pfile/--make-pgen instead of --bfile/--make-bed). In particular, you must use plink2 --vcf for VCF-to-plink2 conversion; VCFtools does not have the necessary export function.
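
For illustration, here is a rough sketch of what that plink2 workflow could look like. It is not a tested recipe for your data. The function name `ld_prune_vcf`, the output prefixes, and the window/step values (50 variants, step 10) are placeholders I made up; the 0.8 threshold is only guessed from your `snps_ld_0.8.prune.in` file name. `--set-all-var-ids '@:#'` assigns CHROM:POS IDs during import, so the original VCF does not need to be edited first. If several variants share a position, a more specific template such as `'@:#:$r:$a'` is probably safer. `--allow-extra-chr` is only needed when contig names are not standard chromosome codes.

```shell
# Sketch only: function name, prefixes and window/step values are assumptions.
# Usage: ld_prune_vcf <input.vcf> <output prefix> [r2 threshold, default 0.8]
ld_prune_vcf() {
    in_vcf=$1
    out_prefix=$2
    r2=${3:-0.8}

    # Fail early with a readable message instead of a plink2 error.
    [ -r "$in_vcf" ] || { echo "cannot read VCF: $in_vcf" >&2; return 1; }
    command -v plink2 >/dev/null 2>&1 || { echo "plink2 not found on PATH" >&2; return 1; }

    # 1. Import the VCF into the plink2 format (.pgen/.pvar/.psam) and
    #    give every variant an ID of the form CHROM:POS.
    plink2 --vcf "$in_vcf" --set-all-var-ids '@:#' --allow-extra-chr \
           --make-pgen --out "$out_prefix" &&
    # 2. LD pruning; writes <prefix>.ld.prune.in and <prefix>.ld.prune.out.
    plink2 --pfile "$out_prefix" --allow-extra-chr \
           --indep-pairwise 50 10 "$r2" --out "$out_prefix.ld" &&
    # 3. Keep only the retained variants and export <prefix>.pruned.vcf.
    plink2 --pfile "$out_prefix" --allow-extra-chr \
           --extract "$out_prefix.ld.prune.in" \
           --export vcf --out "$out_prefix.pruned"
}

# Example call (hypothetical file names):
# ld_prune_vcf original.vcf all_snps 0.8
```

As for the empty VCFtools output: `--snps` matches against the ID column, so if every ID in the original VCF is '.', none of the IDs in the prune.in list can match, which would explain a header-only file.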