Validating a VCF file produced by freebayes
5.4 years ago
dorarinyo88 ▴ 20

Hello, I ran two types of variant calling, using Freebayes and Mpileup, on several individuals. While calling the variants using Freebayes, one of the individuals already successfully converted into VCF file which is larger file size compare to the others. How can I validate whether this file is acceptable or not? Can I compare it against the Mpileup VCF files? Thanks.

SNP snp next-gen • 2.1k views

While calling the variants using Freebayes, one of the individuals already successfully converted into VCF file

This is not very specific. You do not "convert an individual to a vcf file". This sounds like you have a big machine in the lab into which you put a complete human, and on the other side a variant file is printed out. Rather, you should write something like "I have performed alignment (using bwa? be specific), and have performed variant calling using this and that command" (be as complete as possible!).

which is larger file size compare to the others.

File size is a bad metric for comparing content. You could count the lines instead to get a better idea of the number of variants. And if you insist on using file size (bad idea, again), you should at least tell us the file sizes. We are very bad at reading your mind or what's on your screen.
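To make the "count lines" suggestion concrete, here is a minimal Python sketch that counts variant records, i.e. all non-empty lines that do not start with `#`. The function name `count_variant_records` and the file names in the usage note are hypothetical, and the sketch assumes a plain-text or gzip/bgzip-compressed VCF.

```python
import gzip


def count_variant_records(vcf_path):
    """Count data lines (one per variant record) in a plain or gzipped VCF.

    Header lines start with '#' and are skipped, as are empty lines.
    """
    # Assumption: compressed files are recognised by a '.gz' suffix.
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        return sum(
            1 for line in handle
            if line.strip() and not line.startswith("#")
        )
```

Calling it on each per-individual file, for example `count_variant_records("sample1.vcf")`, gives numbers that can be compared directly, which file sizes cannot.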

Please make things as easy as possible for us to help you as fast as possible :-)


Hello dorarinyo88,

by "Mpileup", do you mean the output of samtools mpileup or bcftools mpileup? That would not be your final variant list. For that, a bcftools call step is necessary.

With the default parameters, freebayes outputs a lot of clear false-positive variants. You can safely remove all variants with a QUAL value below 1 before comparing.
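As a rough illustration of that QUAL filter, the following Python sketch passes header lines through and keeps only records whose QUAL (the sixth tab-separated column) is at least 1. The function name `filter_by_qual` is hypothetical, and keeping records with a missing QUAL (`.`) is an assumption made here, not part of the advice above.

```python
def filter_by_qual(lines, min_qual=1.0):
    """Yield VCF header lines unchanged and records with QUAL >= min_qual."""
    for line in lines:
        if line.startswith("#"):
            yield line  # header and meta lines are always kept
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a valid VCF record, skip it
        qual = fields[5]
        # Assumption: records without a QUAL value are kept, not dropped.
        if qual == "." or float(qual) >= min_qual:
            yield line
```

It works on any iterable of lines, so an open file handle can be passed in directly and the result written to a new file with `writelines`.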

For validation you always need something to compare against that you trust. If you have good experience with another variant caller, you can of course take its result as the comparison. A great tool for comparison that I found a few weeks ago is hap.py.

fin swimmer


Hello there,

Thanks for the reply.

By "Mpileup" I mean the output of samtools mpileup. I am confident in the result produced by samtools mpileup. I just wanted to know whether the VCF file produced by Freebayes is likely to be similar to the samtools mpileup VCF file. Is there any way to validate that? I will try the tools you have proposed. Thank you so much.


Hello again,

It doesn't make sense to compare the output of samtools mpileup and freebayes. freebayes gives you the final list of variants, whereas mpileup is a collection of information about every covered position, which bcftools call uses to decide which positions should be investigated for a variant.

fin swimmer


Terminology and difficulties aside, I like to use

vt peek

to assess the number of calls and quality of my VCF files.

Also have a look at

vt compute_concordance

to compare call sets.

SnpSift is also great.
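For a quick look at how two call sets overlap before reaching for a dedicated tool, a naive comparison can be sketched in Python. This is not how vt, SnpSift or hap.py work: it only matches records on identical CHROM, POS, REF and ALT, with no normalization or left-alignment, so differently represented indels will be counted as discordant. The function names are hypothetical.

```python
def load_variant_keys(lines):
    """Collect (CHROM, POS, REF, ALT) keys from VCF lines, one per ALT allele."""
    keys = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and empty lines
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        # Multi-allelic records list ALT alleles separated by commas.
        for allele in alt.split(","):
            keys.add((chrom, int(pos), ref, allele))
    return keys


def compare_call_sets(keys_a, keys_b):
    """Return counts of shared and caller-specific variants."""
    return {
        "shared": len(keys_a & keys_b),
        "only_a": len(keys_a - keys_b),
        "only_b": len(keys_b - keys_a),
    }
```

Feeding it the freebayes calls as set A and the bcftools calls as set B gives a first impression of concordance; a large caller-specific count is a hint to inspect filtering and variant representation.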
