Question: Validate the VCF file by freebayes
dorarinyo88 (20) wrote:

Hello, I ran two types of variant calling, using Freebayes and Mpileup, on several individuals. While calling the variants using Freebayes, one of the individuals already successfully converted into VCF file which is larger file size compare to the others. So how can I validate whether it is acceptable or not? Can I compare it against the Mpileup VCF files? Thanks.

snp next-gen • 208 views
modified 4 weeks ago by Biostar ♦♦ (20) • written 12 weeks ago by dorarinyo88 (20)

"While calling the variants using Freebayes, one of the individuals already successfully converted into VCF file"

This is not very specific. You do not "convert an individual to a vcf file". This sounds like you have a big machine in the lab into which you put a complete human, and a variant file is printed out on the other side. Rather, you should write that you "have performed alignment (using bwa? be specific), and have performed variant calling using this and that command" (be as complete as possible!).

"which is larger file size compare to the others."

File size is a bad metric for comparing content. You could count lines to get a better idea of the number of variants. And if you insist on using file size (bad idea, again), you should at least tell us the file sizes. We are very bad at reading your mind or what's on your screen.
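To make the line-counting suggestion concrete, here is a minimal Python sketch. It assumes a standard VCF in which header lines start with `#` and every other non-blank line is one variant record. The function names and the file name in the usage note are hypothetical, chosen only for illustration.

```python
import gzip


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def count_variant_records(lines):
    """Count data lines, skipping '#' header lines and blank lines."""
    return sum(1 for line in lines if line.strip() and not line.startswith("#"))


def count_variants_in_file(path):
    """Count the variant records in a VCF file on disk."""
    with open_vcf(path) as handle:
        return count_variant_records(handle)
```

For example, `count_variants_in_file("sample1.freebayes.vcf")` (a made-up file name) returns the number of records. Running it on every individual's file gives numbers that can be compared directly, unlike file sizes.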

Please make things as easy as possible for us to help you as fast as possible :-)

written 12 weeks ago by WouterDeCoster (36k)

Hello dorarinyo88 ,

By "Mpileup", do you mean the output of samtools mpileup or bcftools mpileup? That would not be your final variant list. For that, a bcftools call step is necessary.

With the default parameters, freebayes outputs a lot of obvious false-positive variants. You can safely remove all variants with a QUAL value below 1 before comparing.
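As a rough illustration of that QUAL filter, the sketch below keeps header lines and only those records whose QUAL (the sixth tab-separated column of a VCF) is at least 1. The function name is hypothetical, and dropping records with a missing QUAL (`.`) is an assumption of this sketch, not something stated above.

```python
def filter_by_qual(lines, min_qual=1.0):
    """Yield header lines unchanged, plus records with QUAL >= min_qual."""
    for line in lines:
        if line.startswith("#"):
            yield line  # keep the header so the output is still a VCF
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # blank or malformed line: skip it
        qual = fields[5]
        if qual == ".":
            continue  # assumption: records without a QUAL value are dropped
        if float(qual) >= min_qual:
            yield line
```

Because it works on any iterable of lines, it can be applied to an open file handle and the result written straight to a new file.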

For validation you always need something to compare against that you trust. If you have good experience with another variant caller, you can of course take its result as the comparison. A great tool for comparison that I found some weeks ago is hap.py.
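Before reaching for a dedicated tool, a first impression of how two call sets overlap can be obtained with a naive exact-match comparison. The sketch below is not how hap.py works. It simply treats two calls as identical when chromosome, position, REF and ALT match exactly, so variants that are written differently in the two files (for example indels that were not normalized the same way) will be counted as discordant. All names are hypothetical.

```python
def variant_keys(lines):
    """Return a set of (chrom, pos, ref, alt) tuples, one per ALT allele."""
    keys = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # malformed record
        chrom, pos, _id, ref, alt = fields[:5]
        for allele in alt.split(","):  # multi-allelic sites are split up
            keys.add((chrom, int(pos), ref, allele))
    return keys


def compare_call_sets(lines_a, lines_b):
    """Count shared and caller-specific variants by exact match."""
    a, b = variant_keys(lines_a), variant_keys(lines_b)
    return {
        "shared": len(a & b),
        "only_a": len(a - b),
        "only_b": len(b - a),
    }
```

A large "only_a" count for the freebayes file relative to the other caller would be a hint to look at the QUAL distribution of those calls first.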

fin swimmer

written 12 weeks ago by finswimmer (9.9k)

Hello there,

Thanks for the reply.

By "Mpileup" I mean the output of samtools mpileup. I'm confident in the result produced by samtools mpileup. I just wanted to know whether the VCF file produced by Freebayes is likely to be similar to the samtools mpileup VCF file. Is there any way to validate it? I will try the tools you have proposed. Thank you so much.

written 12 weeks ago by dorarinyo88 (20)

Hello again,

It doesn't make sense to compare the output of samtools mpileup with that of freebayes. freebayes gives you the final list of variants, whereas the mpileup output is a collection of information about every covered position, which bcftools call then uses to decide which positions should be investigated for a variant.

fin swimmer

written 12 weeks ago by finswimmer (9.9k)

Terminology and difficulties aside, I like to use

vt peek

to assess the number of calls and quality of my VCF files.

Also have a look at

vt compute_concordance

to compare call sets.

SnpSift is also great.

written 29 days ago by colindaven (1.0k)