I have been carrying out variant calling after alignment using bcftools and extracting a consensus fasta file for small viral genomes for downstream analysis using
bcftools consensus -f ref.fa test.vcf > consensus.fa
I want more control over the variant calling parameters but can't figure out a way to do this using bcftools call from the manual. I have found that VarScan offers more options for minimum coverage, minimum reads, minimum quality etc... but I then can't find a way to extract a consensus fasta file from the output file.
I have two questions
- Please could anyone help me to extract a consensus fasta file from a VarScan pileup2cns output OR
- Please could anyone help me to find out how I can set parameters such as minimum variant coverage, minimum variant frequency using the bcftools call function?
Thanks
Hi Kevin Thanks for your response. I have looked at bcftools filter in the manual before but if I am completely honest I don't really understand it - I'm new to command line so its quite overwhelming. Do I need to use expressions? Say if I use -i QUAL>20 will that only give me those variants with quality scores above 20?
I'll go back over the manual again and try to make sense of it but if you know of any tutorials for this function which may help I'd be extremely grateful Thanks
Yes, for example, this will filter include those variants with QUAL > 20 and position read-depth > 20:
There are examples of expressions at the bottom of the manual page, but I admit that it's a lot to digest for those just getting into this: http://samtools.github.io/bcftools/bcftools.html#expressions
For every VCF that you have, you should know what is the data that is encoded in the VCF. You can know this by looking at the information in the header. Common tags are AD, DP, AF, et cetera.
Hi Kevin Thanks so much for this, its really helpful. I will have a go back through the manual with this information and see what I can achieve.