So I have about 150 of these VCF files, and I forgot to clean up the sequence names in the reference before running all 150. For downstream analysis with snpEff, I need the chromosome ID in that column to contain only JTAI01000001 through JTAI01000053, not all that other junk. Does anyone have a way to strip out everything but the middle JTAI01000001 part of these GVCFs so I can proceed with my analysis?
##contig=<ID=ENA|JTAI01000001|JTAI01000001.1,length=360176>
##contig=<ID=ENA|JTAI01000002|JTAI01000002.1,length=959544>
##contig=<ID=ENA|JTAI01000003|JTAI01000003.1,length=208220>
##contig=<ID=ENA|JTAI01000004|JTAI01000004.1,length=470636>
##contig=<ID=ENA|JTAI01000005|JTAI01000005.1,length=225370>
##contig=<ID=ENA|JTAI01000006|JTAI01000006.1,length=364413>
##contig=<ID=ENA|JTAI01000007|JTAI01000007.1,length=1279890>
##contig=<ID=ENA|JTAI01000008|JTAI01000008.1,length=18993>
##contig=<ID=ENA|JTAI01000009|JTAI01000009.1,length=291696>
##contig=<ID=ENA|JTAI01000010|JTAI01000010.1,length=821>
##contig=<ID=ENA|JTAI01000011|JTAI01000011.1,length=128648>
##contig=<ID=ENA|JTAI01000012|JTAI01000012.1,length=66483>
##contig=<ID=ENA|JTAI01000013|JTAI01000013.1,length=592675>
##contig=<ID=ENA|JTAI01000014|JTAI01000014.1,length=1554>
##contig=<ID=ENA|JTAI01000015|JTAI01000015.1,length=3499>
##contig=<ID=ENA|JTAI01000016|JTAI01000016.1,length=5436>
##contig=<ID=ENA|JTAI01000017|JTAI01000017.1,length=1198>
##contig=<ID=ENA|JTAI01000018|JTAI01000018.1,length=6108>
##contig=<ID=ENA|JTAI01000019|JTAI01000019.1,length=9709>
##contig=<ID=ENA|JTAI01000020|JTAI01000020.1,length=523589>
##contig=<ID=ENA|JTAI01000021|JTAI01000021.1,length=97817>
##contig=<ID=ENA|JTAI01000022|JTAI01000022.1,length=268453>
##contig=<ID=ENA|JTAI01000023|JTAI01000023.1,length=215216>
##contig=<ID=ENA|JTAI01000024|JTAI01000024.1,length=79716>
##contig=<ID=ENA|JTAI01000025|JTAI01000025.1,length=121647>
##contig=<ID=ENA|JTAI01000026|JTAI01000026.1,length=31279>
##contig=<ID=ENA|JTAI01000027|JTAI01000027.1,length=3130>
##contig=<ID=ENA|JTAI01000028|JTAI01000028.1,length=340737>
##contig=<ID=ENA|JTAI01000029|JTAI01000029.1,length=5801>
##contig=<ID=ENA|JTAI01000030|JTAI01000030.1,length=4981>
##contig=<ID=ENA|JTAI01000031|JTAI01000031.1,length=318753>
##contig=<ID=ENA|JTAI01000032|JTAI01000032.1,length=45350>
##contig=<ID=ENA|JTAI01000033|JTAI01000033.1,length=114418>
##contig=<ID=ENA|JTAI01000034|JTAI01000034.1,length=1682>
##contig=<ID=ENA|JTAI01000035|JTAI01000035.1,length=28211>
##contig=<ID=ENA|JTAI01000036|JTAI01000036.1,length=117188>
##contig=<ID=ENA|JTAI01000037|JTAI01000037.1,length=188157>
##contig=<ID=ENA|JTAI01000038|JTAI01000038.1,length=3440>
##contig=<ID=ENA|JTAI01000039|JTAI01000039.1,length=373676>
##contig=<ID=ENA|JTAI01000040|JTAI01000040.1,length=996>
##contig=<ID=ENA|JTAI01000041|JTAI01000041.1,length=618>
##contig=<ID=ENA|JTAI01000042|JTAI01000042.1,length=211284>
##contig=<ID=ENA|JTAI01000043|JTAI01000043.1,length=87165>
##contig=<ID=ENA|JTAI01000044|JTAI01000044.1,length=873289>
##contig=<ID=ENA|JTAI01000045|JTAI01000045.1,length=795>
##contig=<ID=ENA|JTAI01000046|JTAI01000046.1,length=590>
##contig=<ID=ENA|JTAI01000047|JTAI01000047.1,length=705>
##contig=<ID=ENA|JTAI01000048|JTAI01000048.1,length=1262>
##contig=<ID=ENA|JTAI01000049|JTAI01000049.1,length=1307>
##contig=<ID=ENA|JTAI01000050|JTAI01000050.1,length=766>
##contig=<ID=ENA|JTAI01000051|JTAI01000051.1,length=795>
##contig=<ID=ENA|JTAI01000052|JTAI01000052.1,length=724>
##contig=<ID=ENA|JTAI01000053|JTAI01000053.1,length=619>
##reference=file:///scratch/gwc32007/crypto_genomes/30976_hominis_genome.fasta
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT ERR1305010
ENA|JTAI01000001|JTAI01000001.1 1 . A <NON_REF> . . END=2 GT:DP:GQ:MIN_DP:PL 0:7:0:7:0,0
ENA|JTAI01000001|JTAI01000001.1 3 . A <NON_REF> . . END=3 GT:DP:GQ:MIN_DP:PL 0:7:99:7:0,298
ENA|JTAI01000001|JTAI01000001.1 4 . C <NON_REF> . . END=8 GT:DP:GQ:MIN_DP:PL 0:7:0:7:0,0
ENA|JTAI01000001|JTAI01000001.1 9 . A <NON_REF> . . END=9 GT:DP:GQ:MIN_DP:PL 0:7:99:7:0,300
ENA|JTAI01000001|JTAI01000001.1 10 . C <NON_REF> . . END=11 GT:DP:GQ:MIN_DP:PL 0:7:0:7:0,0
ENA|JTAI01000001|JTAI01000001.1 12 . C <NON_REF> . . END=12 GT:DP:GQ:MIN_DP:PL 0:7:99:7:0,284
ENA|JTAI01000001|JTAI01000001.1 13 . T <NON_REF> . . END=14 GT:DP:GQ:MIN_DP:PL 0:7:0:7:0,0
ENA|JTAI01000001|JTAI01000001.1 15 . A <NON_REF> . . END=18 GT:DP:GQ:MIN_DP:PL 0:7:99:7:0,276
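For what it's worth, the substitution being asked for can be sketched in a few lines of Python. This is only a minimal sketch, assuming every long ID has exactly the three-part ENA|accession|accession.version shape shown above and that the files are plain-text (uncompressed) VCF. The names LONG_ID, shorten_at_start, rewrite_vcf_line and rewrite_vcf are made up for illustration and are not part of any tool.

```python
import re

# Assumed ID shape: ENA|<accession>|<accession.version>; capture the middle field.
LONG_ID = re.compile(r"ENA\|([^|\s]+)\|[^|\s,>]+")
CONTIG_PREFIX = "##contig=<ID="


def shorten_at_start(text):
    """Replace a long ENA-style ID at the very start of text with its middle field."""
    match = LONG_ID.match(text)
    if match is None:
        return text  # leave anything that does not fit the assumed shape untouched
    return match.group(1) + text[match.end():]


def rewrite_vcf_line(line):
    """Shorten the contig ID in ##contig header lines and in the CHROM column."""
    if line.startswith(CONTIG_PREFIX):
        return CONTIG_PREFIX + shorten_at_start(line[len(CONTIG_PREFIX):])
    if line.startswith("#"):
        return line  # other header lines (##reference, #CHROM, ...) are kept as-is
    return shorten_at_start(line)  # data record: CHROM is the first column


def rewrite_vcf(src_path, dst_path):
    """Stream one uncompressed VCF through rewrite_vcf_line into a new file."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(rewrite_vcf_line(line))
```

Calling rewrite_vcf once per file in a loop would cover all 150. Compressed files would need to be decompressed first (or read with the gzip module) and then re-compressed and re-indexed afterwards.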
Pierre, in which version do you see the from\tto description? The manual (both of them, actually) says white-space separated.
That's odd - the manual on htslib.org still says white-space separated, but when I run bcftools annotate on my local machine, I see the from\tto description. Could it be that the online manual is not being maintained properly?
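Whichever description is current, a tab is also white space, so a tab-separated map file should satisfy both readings. Here is a minimal sketch of building such a file from the ##contig header lines, assuming the three-part ENA|accession|accession.version IDs from the question. The helper names rename_map and format_map are hypothetical, and the resulting file is meant to be passed to bcftools annotate via its --rename-chrs option.

```python
import re

# Pull the ID out of a header line such as ##contig=<ID=...,length=...>
CONTIG_ID = re.compile(r"^##contig=<ID=([^,>]+)")


def rename_map(header_lines):
    """Return (old_name, new_name) pairs for contigs with three-part ENA-style IDs."""
    pairs = []
    for line in header_lines:
        match = CONTIG_ID.match(line)
        if match is None:
            continue  # not a ##contig line
        old_name = match.group(1)
        parts = old_name.split("|")
        if len(parts) == 3:  # assumed shape: ENA|accession|accession.version
            pairs.append((old_name, parts[1]))
    return pairs


def format_map(pairs):
    """Render one tab-separated old/new pair per line."""
    return "".join(f"{old}\t{new}\n" for old, new in pairs)
```

Writing format_map(rename_map(header_lines)) to a text file once should be enough, since all 150 files were called against the same reference and so share the same contig names.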