Inserting Variant allele frequency to VCF file
3.8 years ago

I calculated the variant allele frequency (VAF) for the Strelka variants using the Strelka parser. Now I want to add that VAF to my VCF file. Is there a way to do that? Currently the VAF is stored as a data frame column in R.

Your help is appreciated. Thanks!

variants Strelka VCF VAF NGS • 3.2k views

@parvathi.sudha can you please provide an example of how to use the Strelka parser to calculate vaf? I'm trying to do this myself but the git link only provides the helper functions. Thanks!

3.8 years ago
Ram 43k

You can export the data to a tab-separated file and use bcftools annotate to add an INFO field to the VCF file.

Make sure you have CHROM, POS, REF and ALT in the exported file and you use bcftools annotate -c none. It is also best if you decompose multi-allelic variants to bi-allelics using bcftools norm (or better, vt decompose) before attempting this operation.
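As a rough sketch of the workflow described above (not a tested recipe for the poster's data): the file names input.vcf, annots.tab and annots.hdr and the tag name VAF are placeholders chosen for illustration, and the tiny VCF is made up so that the example is self-contained. The bcftools, bgzip and tabix steps only run if those tools are installed.

```shell
#!/bin/sh
# Sketch only: file names and the VAF tag are placeholders, not from the thread.

# Tiny stand-in VCF so the example is self-contained
{
    printf '##fileformat=VCFv4.2\n'
    printf '##contig=<ID=chr1>\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf 'chr1\t118147675\t.\tA\tC\t.\tPASS\t.\n'
} > input.vcf

# Table exported from R: CHROM, POS, REF, ALT, VAF (tab-separated, no column-name row)
printf 'chr1\t118147675\tA\tC\t0.0617\n' > annots.tab

# Header line declaring the new INFO tag
printf '##INFO=<ID=VAF,Number=1,Type=Float,Description="Variant allele frequency">\n' > annots.hdr

# Run the bcftools steps only if the tools are installed
if command -v bcftools >/dev/null 2>&1 && command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
    # Split multi-allelic records into bi-allelic ones
    bcftools norm -m -any -o split.vcf input.vcf
    # bcftools annotate needs a bgzip-compressed, tabix-indexed table
    bgzip -c annots.tab > annots.tab.gz
    tabix -f -s 1 -b 2 -e 2 annots.tab.gz
    # Match on CHROM, POS, REF and ALT; write the fifth column to INFO/VAF
    bcftools annotate -a annots.tab.gz -h annots.hdr -c CHROM,POS,REF,ALT,VAF -o annotated.vcf split.vcf
fi
```

With a real export, the table should be sorted by chromosome and position before it is compressed and indexed.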


Hi, Thank you for your reply. When using bcftools annotate, do I have to give all the columns in the vcf as tab separated values? Or just the vaf column? The vcf file has CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, NORMAL, TUMOR columns. I couldn't append vaf to vcf file.

Thank you for the help.


You need to export CHROM, POS, REF, ALT and VAF (if VAF is variant-level INFO) as a tab-separated file. You can then annotate each site in the VCF with the tab separated file.

However, if the VAF is sample-specific information that would go in the FORMAT field (and in each sample's FORMAT), I'm not sure how bcftools annotate would handle that.


EDIT: It seems like you're adding an INFO field. That should be pretty straightforward with bcftools annotate.


I tried bcftools annotate, but that didn't work.

head annots.tab
CHROM  POS        REF  ALT  AF
chr1   118147675  A    C    0.0617283950617284
chr12  25362777   A    G    0.0746268656716418
chr2   141707882  G    T    0.0508474576271186

cat annots.hdr

##INFO=<ID=AF,Number=1,Type=Float,Description="Allele frequency">

bcftools annotate -a annots.tab -h annots.hdr -c CHROM,POS,REF,ALT,AF Sample1_snvs.fpfilter_passed_vep_vaf.vcf

Error:

[E::vcf_hdr_read] No sample line
Failed to open 27892_D-PL3668_CD138_S4_snvs.fpfilter_passed_vep_vaf.vcf: could not parse header
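The "No sample line" message is raised while the VCF header is being read, which suggests the input VCF itself may be missing its #CHROM line (for instance, if it was written out without the original header) rather than a problem in annots.hdr. Separately, the table shown above starts with a column-name row, which bcftools would try to read as a record. Below is a rough pre-flight check under those assumptions; check_vcf_header is a hypothetical helper written for this sketch, it only handles uncompressed VCFs, and demo.vcf and demo_annots.tab are made-up stand-ins.

```shell
#!/bin/sh
# Sketch only: demo.vcf and demo_annots.tab are made-up stand-ins.
TAB=$(printf '\t')

# Succeeds only if an uncompressed VCF has both header lines bcftools needs
check_vcf_header() {
    head -n 1 "$1" | grep -q '^##fileformat=VCF' || { echo "$1: no ##fileformat line"; return 1; }
    grep -q "^#CHROM${TAB}POS" "$1" || { echo "$1: no #CHROM line"; return 1; }
    echo "$1: header looks OK"
}

# A headerless VCF body, to show what the check reports
printf 'chr1\t118147675\t.\tA\tC\t.\tPASS\t.\n' > demo.vcf
check_vcf_header demo.vcf || echo "restore the original header before annotating"

# Exported table with a column-name row, in the order shown in the post
{
    printf 'CHROM\tPOS\tREF\tALT\tAF\n'
    printf 'chr1\t118147675\tA\tC\t0.0617283950617284\n'
    printf 'chr12\t25362777\tA\tG\t0.0746268656716418\n'
    printf 'chr2\t141707882\tG\tT\t0.0508474576271186\n'
} > demo_annots.tab

# Drop the column-name row, then sort by chromosome and numeric position
tail -n +2 demo_annots.tab | LC_ALL=C sort -t "$TAB" -k1,1 -k2,2n > demo_annots.sorted.tab
```

The sorted table can then be compressed with bgzip and indexed with tabix before it is passed to bcftools annotate -a.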

3.8 years ago

I'd be tempted to read your original VCF into R using the vcfR package, and then manually add your column to that class before writing it back out. Be sure to edit the metadata in the header too. You'll have to read the docs for the vcfR class to determine how best to do it, but it should be as easy as pasting your column to the INFO column in the fix matrix and adding the new field to the metadata matrix.


Hi,

I tried this method too. I could add the VAF to the INFO column in the fix matrix, but I couldn't add the new field to the metadata. I also didn't quite understand how to convert the vcfR object back to a .vcf file.

Thank you for the help.


Writing back to vcf format is as easy as: write.vcf(x, file = "", mask = FALSE, APPEND = FALSE) where x is your vcfR object. The metadata has to be added to the meta slot of the vcfR object - the reference manual can probably help you a bit there.


To extract and modify the vcf file, these are the steps I followed.

library(vcfR)
library(dplyr)   # provides %>%
library(tibble)  # provides add_row()

# Read the VCF into a vcfR object
vcf_file1 <- read.vcfR("sample1_snvs.fpfilter_passed_vep.vcf")
# Convert to tidy tibbles so the INFO data can be modified
temp <- vcfR2tidy(vcf_file1)
# Add the variant allele frequency (calculated using the Strelka parser)
temp$fix$AF <- as.numeric(vaf$T_VAF)
# Register the new INFO field in the metadata
temp$meta <- temp$meta %>% add_row(Tag = "INFO", ID = "AF", Number = "A", Type = "Float", Description = "Allele Frequency")

class(temp)
[1] "list"

temp is a list containing fix, gt and meta, each of which has the classes tbl_df, tbl and data.frame.

write.vcf() needs a vcfR object, but temp is no longer in vcfR format.


Ram made an excellent point above, I didn't even think about multiple samples, so be wary of that if using this method. It looks like vcfR still lets you parse multiple FORMAT fields, but I'm guessing it gets messy.

You may have to manually reassign the fix and meta slots to your original vcfR object to create one with your edited data. It doesn't look like there are convenient methods to generate one from scratch.


It makes sense that we did not immediately think about multiple samples - after all, annotations are site-specific, not sample specific. Sample specific annotations become relevant in MAF files, not in VCF files.
