5.4 years ago
seta
Dear all,
I have multiple single-sample VCF files that need to be merged into one multi-sample VCF file. For the VCF normalization step before merging, could you please tell me which human reference genome should be used? I found the references here at Ensembl, but I have no idea which one to download. Since these VCF files come from whole-genome sequencing and the bwa aligner was used for the mapping step, I guess the toplevel assembly should be used; is that right? Please kindly share your suggestions and ideas.
Thanks
Why would you need a different reference for merging the VCFs?
I wasn't involved in the mapping step and don't have access to the person responsible; I received only the VCF files.
How will you merge the VCFs? A tool like "bcftools merge" doesn't need a reference, while "gatk combinevariants" does.
I'm going to use bcftools, and as far as I know it requires a reference genome for the variant normalization that should be done before merging, doesn't it? Please let me know what you think.
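For reference, the normalize-then-merge plan could be sketched roughly as below. This is only a minimal sketch: the helper names and all file names are hypothetical, and the functions just build the command lines (they do not run bcftools). It assumes the reference FASTA matches the one used for mapping.

```python
def build_norm_command(vcf, reference_fasta, out_vcf):
    """Build a `bcftools norm` command that normalizes indels against a reference.

    -f gives the reference FASTA; -Oz writes bgzip-compressed VCF to -o.
    """
    return ["bcftools", "norm", "-f", str(reference_fasta),
            "-Oz", "-o", str(out_vcf), str(vcf)]


def build_index_command(vcf_gz):
    """Build a `bcftools index` command (merge needs indexed, bgzipped inputs)."""
    return ["bcftools", "index", str(vcf_gz)]


def build_merge_command(normalized_vcfs, out_vcf):
    """Build a `bcftools merge` command; note that no reference is passed here."""
    return ["bcftools", "merge", "-Oz", "-o", str(out_vcf),
            *[str(v) for v in normalized_vcfs]]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    samples = ["sampleA.vcf.gz", "sampleB.vcf.gz"]
    normalized = [s.replace(".vcf.gz", ".norm.vcf.gz") for s in samples]
    for src, dst in zip(samples, normalized):
        print(" ".join(build_norm_command(src, "reference.fa", dst)))
        print(" ".join(build_index_command(dst)))
    print(" ".join(build_merge_command(normalized, "merged.vcf.gz")))
```

Only the normalization step takes the reference; the merge step itself does not need one.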
Ahh, OK, you were talking about this kind of normalization.
Yes. Could you please tell me which reference genome should be used in this situation? I found these at Ensembl; which one should I download?
You have to check the ##contig lines in your VCF header and make sure they have the same chromosome names and the same lengths as the reference (REF).
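The check described above could be sketched like this. It is a minimal sketch under some assumptions: the function names are made up for illustration, the reference side is read from a samtools-style .fai index (name and length in the first two tab-separated columns), and the ##contig parsing is naive (it splits on commas, so it would not handle quoted values that contain commas).

```python
import re

CONTIG_RE = re.compile(r"^##contig=<(.*)>\s*$")


def parse_vcf_contigs(header_lines):
    """Return {contig name: length or None} from VCF ##contig header lines."""
    contigs = {}
    for line in header_lines:
        match = CONTIG_RE.match(line)
        if not match:
            continue
        # Naive key=value split; assumes no commas inside quoted values.
        fields = dict(item.split("=", 1)
                      for item in match.group(1).split(",") if "=" in item)
        if "ID" in fields:
            length = fields.get("length", "")
            contigs[fields["ID"]] = int(length) if length.isdigit() else None
    return contigs


def parse_fai(fai_lines):
    """Return {sequence name: length} from the lines of a FASTA .fai index."""
    lengths = {}
    for line in fai_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) >= 2:
            lengths[parts[0]] = int(parts[1])
    return lengths


def compare_contigs(vcf_contigs, ref_lengths):
    """List the VCF contigs that are missing from, or differ in length to, the reference."""
    problems = []
    for name, length in vcf_contigs.items():
        if name not in ref_lengths:
            problems.append(f"{name}: not in reference")
        elif length is not None and length != ref_lengths[name]:
            problems.append(
                f"{name}: length {length} != reference {ref_lengths[name]}")
    return problems
```

To use it on real files, pass the header lines of the VCF (for example read with `gzip.open(path, "rt")`) and the lines of the `.fai` file for the candidate reference; an empty result means the names and lengths agree. A mismatch such as `chr1` in the VCF versus `1` in the reference indicates that the candidate reference uses a different naming convention from the one used for mapping.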