How to write a Python script to edit a .vcf file?
2 days ago
tidalArms • 0

I am trying to update an older .vcf file with some new information so that it reflects the changes that have been implemented in the . The sample IDs have changed, as has some of the formatting of the GT, DS, and GP information (e.g. changing forward slashes to pipe symbols). I have been researching how best to do this with the Python package PyVCF, but it's not entirely clear from its docs how one can do this. I tried to use a PyVCF Writer object that would copy the template of the new .vcf file (i.e. its metadata and format). I then wanted a for loop that iterates over each record in the old .vcf, changes each sample name (based on a pre-existing dict), and modifies the content of the INFO, FORMAT, and sample result sections.

However, PyVCF does not seem to have any tools to do this easily. I then found another library called VCFPy, but it does not seem to have any clear-cut tools for this either.

With both packages, I wanted to iterate over the old .vcf file (as a reader object), copy each sample and variant, and modify each one. So my code would look roughly like this:

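import vcf  # PyVCF

old_vcf_reader = vcf.Reader(filename='vcf/test/tb.vcf.gz')
# The writer takes its header from a template reader; here the old reader is used
new_vcf_writer = vcf.Writer(open('/dev/null', 'w'), old_vcf_reader)
for record in old_vcf_reader:
    # TODO: update sample names and modify GT formats here
    new_vcf_writer.write_record(record)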

Does anybody know how I can update or modify the content of each record inside the for loop above?

vcfpy vcf python pyvcf • 115 views

Have you looked at the cyvcf2 documentation? That's my preferred module for working with VCFs.


I have not tried it yet. I will look into it now.
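
For what it's worth, the two edits described in the question (renaming samples from a dict and swapping "/" for "|" in GT) can also be sketched without any VCF library, by treating the file as tab-separated text. This is only a rough sketch under a few assumptions: the file is uncompressed and well-formed, sample columns start at the tenth column as in the VCF layout, and the function names and `name_map` are made up for illustration. Note that "|" marks a genotype as phased in VCF, so a plain text swap only makes sense if the data really are phased.

```python
import io


def rename_samples(header_line, name_map):
    """Rename sample columns in the '#CHROM' header line using name_map."""
    cols = header_line.rstrip("\n").split("\t")
    # The first 9 columns are fixed (CHROM..FORMAT); samples start at index 9.
    cols[9:] = [name_map.get(name, name) for name in cols[9:]]
    return "\t".join(cols) + "\n"


def phase_genotypes(record_line):
    """Replace '/' with '|' in the GT subfield of every sample column."""
    cols = record_line.rstrip("\n").split("\t")
    if len(cols) < 10:
        return record_line  # no FORMAT/sample columns to edit
    keys = cols[8].split(":")
    if "GT" not in keys:
        return record_line
    gt_idx = keys.index("GT")
    for i in range(9, len(cols)):
        fields = cols[i].split(":")
        if gt_idx < len(fields):
            fields[gt_idx] = fields[gt_idx].replace("/", "|")
        cols[i] = ":".join(fields)
    return "\t".join(cols) + "\n"


def convert_vcf(src, dst, name_map):
    """Copy a VCF from src to dst, renaming samples and rewriting GT."""
    for line in src:
        if line.startswith("##"):
            dst.write(line)  # meta-information lines pass through unchanged
        elif line.startswith("#CHROM"):
            dst.write(rename_samples(line, name_map))
        else:
            dst.write(phase_genotypes(line))


if __name__ == "__main__":
    # Tiny in-memory demo; the sample names here are hypothetical.
    demo = io.StringIO(
        "##fileformat=VCFv4.2\n"
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\told1\n"
        "1\t100\t.\tA\tG\t.\tPASS\t.\tGT:DS\t0/1:1.0\n"
    )
    out = io.StringIO()
    convert_vcf(demo, out, {"old1": "new1"})
    print(out.getvalue(), end="")
```

For a real file you would pass open file handles instead of the in-memory buffers; for a .vcf.gz input, `gzip.open(path, "rt")` from the standard library gives a text handle that works the same way. A proper parser such as cyvcf2 is probably still the safer route if the INFO or FORMAT definitions in the header also need to change.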

