convert VCF to gVCF
22 months ago
mateid • 0

Hello everybody,

Apologies if this has been answered before; I expect it is simple for someone more experienced. I recently received the results of my whole-genome sequencing in VCF format (one file for SNPs and one for INDELs). I would like to recreate a full-genome VCF from these files (I have no BAM). I have the exact reference FASTA used to generate the VCFs, but even after a lot of searching and experimentation I am no closer to my goal. So far I have only managed to merge the two VCFs and apply the resulting file to the reference FASTA, which gave me a new FASTA file.

Is there any way to convert this file (or another file that can be derived from the VCFs) to gVCF? My goal is to extract all SNPs, not just the variations from reference.

Thank you!

format conversion • 1.6k views

"My goal is to extract all SNPs, not just the variations from reference."

What do you mean by that? How were the SNPs determined in the first place if not by comparing your sequencing results to the reference genome?


The SNPs were indeed determined the way you said, but the VCF files contain only variations from the reference genome. I would like to generate a gVCF, which contains SNPs from the entire genome.


Variants are between your genome and the reference genome. That is what variant means.


Thank you! Could you please elaborate a bit on the "counts"? I understand that in practice not every non-variant site is necessarily identical to the reference genome, but if I were to make that assumption, wouldn't it be possible (at least in theory) to recreate the entire genome from just the reference file and the VCFs?
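
As a rough illustration of that assumption (everything outside the VCF is taken to equal the reference), here is a minimal Python sketch that applies a list of variants to a reference string. The function name `apply_variants`, the toy sequence, and the toy variants are all made up for the example, and it assumes homozygous, non-overlapping variants on a single contig. It reconstructs a sequence, but it says nothing about how well the untouched positions were actually covered by reads.

```python
def apply_variants(reference, variants):
    """Return the sequence implied by applying `variants` to `reference`.

    `variants` is a list of (pos, ref, alt) tuples with 1-based positions,
    as in a VCF. Assumes homozygous, non-overlapping variants on one contig.
    """
    pieces = []
    cursor = 0  # 0-based index of the next reference base not yet copied
    for pos, ref, alt in sorted(variants):
        start = pos - 1
        if start < cursor:
            raise ValueError(f"overlapping variant at position {pos}")
        if reference[start:start + len(ref)] != ref:
            raise ValueError(f"REF allele does not match reference at {pos}")
        pieces.append(reference[cursor:start])  # unchanged reference bases
        pieces.append(alt)                      # allele reported in the VCF
        cursor = start + len(ref)
    pieces.append(reference[cursor:])           # tail after the last variant
    return "".join(pieces)


# Toy data, not taken from any real genome.
reference = "ACGTACGTAC"
variants = [
    (3, "G", "T"),    # SNP
    (5, "AC", "A"),   # deletion of the base at position 6
    (9, "A", "AGG"),  # insertion after position 9
]
sample = apply_variants(reference, variants)
print(sample)
```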

22 months ago

You cannot convert a VCF to a gVCF. To generate a gVCF you need a BAM file.

The difference between a VCF and a gVCF is that the gVCF also covers every site where you don't have a variant, giving the counts of bases that match the reference genome at those sites. You cannot get that without the BAM file.

Also, what you want is not more SNPs; you want all the positions where no variant was detected.
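
To make the point concrete, here is a small Python sketch of why the VCF alone is not enough: a position missing from the VCF could either match the reference or simply have too few reads to call. The helper `classify_site`, the `min_depth` threshold of 10, and the toy depths are assumptions for illustration only; a real caller uses genotype likelihoods rather than a bare depth cutoff.

```python
def classify_site(pos, variant_positions, depth_by_pos, min_depth=10):
    """Label one 1-based position as 'variant', 'reference', or 'no-call'."""
    if pos in variant_positions:
        return "variant"
    if depth_by_pos.get(pos, 0) >= min_depth:
        # Enough reads cover the site; a real caller would also check
        # that those reads actually match the reference base.
        return "reference"
    # Absent from the VCF, but too few reads to say the site is reference.
    return "no-call"


variant_positions = {3}                     # what a plain VCF tells you
depth_by_pos = {1: 25, 2: 30, 3: 28, 4: 2}  # what only the reads (BAM) tell you

labels = {pos: classify_site(pos, variant_positions, depth_by_pos)
          for pos in range(1, 6)}
print(labels)
```

Positions 4 and 5 look identical to positions 1 and 2 in the VCF (all are absent), and only the read depth separates them.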


OK, so if I have a .vcf and a .bam file, can I convert them to gVCF? The tool I use to generate VCFs from BS-seq data doesn't have an option to generate a gVCF. But maybe an external tool exists that can do it?


For a single-sample VCF and one BAM, yes, in theory you could do this, but it would be easier to just improve your BS-seq tool.
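
A very simplified Python sketch of the "in theory" part: given per-position read depth (which would have to come from the BAM, for example via a depth tool such as `samtools depth`) and the variant positions from the VCF, runs of covered non-variant sites can be collapsed into reference blocks. The function `reference_blocks`, the contig name `chr1`, the placeholder REF base `N`, and the toy numbers are assumptions. The printed lines are only loosely modelled on GATK-style gVCF blocks (`<NON_REF>` ALT, `END=` in INFO); real gVCFs also carry genotype qualities and likelihoods, which this does not attempt.

```python
def reference_blocks(depths, variant_positions, min_depth=1):
    """Return (start, end, min_dp) for runs of covered, non-variant sites.

    `depths[i]` is the read depth at 1-based position i + 1.
    """
    blocks = []
    start = None
    min_dp = None
    for pos, dp in enumerate(depths, start=1):
        is_ref_site = pos not in variant_positions and dp >= min_depth
        if is_ref_site:
            if start is None:
                start, min_dp = pos, dp      # open a new block
            else:
                min_dp = min(min_dp, dp)     # extend the current block
        elif start is not None:
            blocks.append((start, pos - 1, min_dp))  # close the block
            start = None
    if start is not None:
        blocks.append((start, len(depths), min_dp))
    return blocks


# Toy data: depth per position, and one variant taken from the VCF.
depths = [12, 15, 14, 0, 0, 9, 11, 13]
variant_positions = {7}

blocks = reference_blocks(depths, variant_positions)
for start, end, min_dp in blocks:
    # "N" stands in for the real reference base at `start`.
    print(f"chr1\t{start}\t.\tN\t<NON_REF>\t.\t.\tEND={end}\tGT:MIN_DP\t0/0:{min_dp}")
```

Positions 4 and 5 have no reads, so they end up in no block at all, which is exactly the information a plain VCF cannot express.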
