OK, I have searched for this everywhere, and I just can't seem to even figure out if it is possible/meaningful to annotate (add tags and associated data) VCFs with external 'genotype' (
I know that bcftools annotate and other tools can add INFO tags and can EXCLUDE FORMAT tags. However, the information that I need to add does not make sense except when associated with a specific sample, while INFO tags apply to the variant without regard to the sample in which it occurs.
For example, I am comparing multiple family-triplets (mother/father/affected-child). I would like to add a tag in the FORMAT field that represents which triplet the individual belongs to. In addition, I would like to add information to each sample that indicates which 'mode of inheritance' the SNP appears to follow in each triplet.
This is information that is inherently tied to the sample and therefore ill-suited to the INFO-type tag; however, I cannot for the life of me find a tool that even mentions this. Am I missing some super-obvious reason that people don't ever need/want to be able to annotate VCFs in this fashion? Or is my google-fu simply too weak?
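To make the goal concrete, here is a minimal sketch in plain Python of the result I am after for a single VCF data line: extra keys appended to the FORMAT column, and a matching value appended to each sample column. This is not bcftools or any existing tool; the function name `add_format_tags`, the sample names, the REF/ALT alleles, and the family ID `fam1` are all made up for illustration. It also assumes the matching ##FORMAT header lines are added separately.

```python
def add_format_tags(record_line, sample_names, per_sample_values):
    """Append FORMAT keys and per-sample values to one VCF data line.

    record_line: tab-separated VCF data line that already has FORMAT
        and sample columns (columns 9 and onward).
    sample_names: sample IDs in the same order as the #CHROM header line.
    per_sample_values: {tag: {sample_name: value}}; a sample with no
        value for a tag gets the VCF missing value ".".
    """
    fields = record_line.rstrip("\n").split("\t")
    fixed, fmt, samples = fields[:8], fields[8], fields[9:]
    if len(samples) != len(sample_names):
        raise ValueError("sample count does not match the header")

    tags = list(per_sample_values)
    # New keys go at the end of FORMAT so existing keys keep their positions.
    new_fmt = ":".join([fmt] + tags)

    new_samples = []
    for name, column in zip(sample_names, samples):
        extra = [str(per_sample_values[tag].get(name, ".")) for tag in tags]
        new_samples.append(":".join([column] + extra))

    return "\t".join(fixed + [new_fmt] + new_samples)


if __name__ == "__main__":
    # Hypothetical trio and record; only the position comes from my data.
    trio = ["mother", "father", "child"]
    line = "1\t12921499\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t0/0\t0/1"
    values = {
        "FAM_ID": {name: "fam1" for name in trio},
        "MOI": {"child": "CmpHet"},
    }
    print(add_format_tags(line, trio, values))
```

In this made-up example the FORMAT column becomes GT:FAM_ID:MOI, and only the child carries a non-missing MOI value, which is exactly the kind of per-sample information an INFO tag cannot express.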
For your reference I will share what I have attempted using bcftools annotate (all zipping and indexing of related files has been omitted here for brevity):
CHROM  POS       AGE_MO  BAM_OK  FAM_ID         MOI
1      12921499  30      0       youdontknowme  CmpHet
1      12921600  30      0       youdontknowme  CmpHet
1      12939476  30      0       youdontknowme  CmpHet
1      12939562  30      0       youdontknowme  CmpHet
1      12939747  30      0       youdontknowme  CmpHet
1      12942047  30      1       youdontknowme  CmpHet
1      12942138  30      1       youdontknowme  CmpHet
1      12942179  30      1       youdontknowme  CmpHet
...
##FORMAT=<ID=AGE_MO,Number=1,Type=Float,Description="Age of associated proband in months.">
##FORMAT=<ID=FAM_ID,Number=1,Type=String,Description="Identification of family to which the individual belongs.">
##FORMAT=<ID=BAM_OK,Number=0,Type=Flag,Description="Manual inspection of the BAM file corroborates the MOI.">
##FORMAT=<ID=MOI,Number=1,Type=String,Description="Mode of Inheritance: HZR=recessive, DeNovo=de novo, XL=X-linked, CmpHet=Compound Het">
bcftools annotate -a annots.tab.gz -h annots.hdr -c CHROM,POS,AGE_MO,BAM_OK,FAM_ID,MOI data.vcf.bgz -Ou -o annotated.data.bcf
The tag "AGE_MO" is not defined in annots.tab.gz