I am having the following issue. I used STACKS to produce a .vcf file and it does not contain header for contigs, and I need this in order to edit the ID column which is all 0s at the moment (see below, well, it is a bit mixed up). This is how the vcf file looks like (header and a few more rows):
##fileformat=VCFv4.2 ##fileDate=20191104 ##source="Stacks v2.41" ##INFO=<ID=AD,Number=R,Type=Integer,Description="Total Depth for Each Allele"> ##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency"> ##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth"> ##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data"> ##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allele Depth"> ##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth"> ##FORMAT=<ID=HQ,Number=2,Type=Integer,Description="Haplotype Quality"> ##FORMAT=<ID=GL,Number=G,Type=Float,Description="Genotype Likelihood"> ##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality"> ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype"> ##INFO=<ID=loc_strand,Number=1,Type=Character,Description="Genomic strand the corresponding Stacks locus aligns on"> #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Lib1_3A Lib1_3B Lib1_3C Lib1_3D Lib1_3E Lib1_3F Lib1_4A Lib1_4B Lib1_4C Lib1_4D Lib1_4E Lib1_4F Lib1_7A Lib1_7A2 Lib1_7D Lib1_7D2 Lib1_7E Lib1_7F scaffold_45 10 0 A C 0
And when I use this option to edit the #IDs (I want IDs to be like scaffold_45_10, which is scaffold _CHROM_POS):
bcftools annotate --set-id +'%CHROM\_%POS' myfilename.vcf > new.vcf.
I get this error:
[W::vcf_parse] Contig 'scaffold_45' is not defined in the header. (Quick workaround: index the file with tabix.) Encountered error, cannot proceed. Please check the error output above.
Therefore I would need to define contigs in the header (that is what the error says). To do so, I tried what is explain in the last message in this post: bcftools_Issue#766 Following exactly what it is said there I got a vcf file that looks like this:
##contig=<ID=scaffold_45, length=120> ##contig=<ID=scaffold_62, length=125> ##contig=<ID=scaffold_69, length=132> ##contig=<ID=scaffold_98, length=172> ##contig=<ID=scaffold_154, length=149> ##contig=<ID=scaffold_210, length=184> (there are more rows) **VIM - Vi IMproved 8.0 (2016 Sep 12, compiled Jun 21 2019 04:10:35)** scaffold_45 10 0 A C 0 PASS
Which is wrong. The header that was there has been replaced by the list of contigs IDs and lengths (which is what I want to include in the header but without losing the header that was there). There is also one extra row (VIM - Vi IMproved 8.0 (2016 Sep 12, compiled Jun 21 2019 04:10:35) that I don't know how it appeared and should not be there.
In short, I need to be able to edit the IDs of the vcf file, and for that I need to add the contig info in the header of the vcf file. The reason I want the IDs in the vcf file is to filter SNPs by IDs with vcftools. Does someone know how to solve this issue?
Thanks in advance,
PS: I may tray to include the header of my initial vcf file into the header.txt and see what happens...