SHAPEIT v4: keep QUAL, FORMAT and INFO fields
9 weeks ago
nhaus ▴ 60

Hello,

I am using SHAPEIT v4 to phase my germline mutation calls, which I obtained with GATK's HaplotypeCaller (WGS data). As a reference panel, I am using the 1000 Genomes Project. Everything seems to work: the program runs without an error and my samples are phased afterwards. The issue is that the resulting phased VCF file loses almost all INFO, FORMAT and QUAL entries; for FORMAT, for example, only GT remains. So my question is whether it is possible to simply "add" the phased genotypes and keep all other entries of my VCF.

Furthermore, I noticed that SHAPEIT v4 only retains variants that are shared between the input (my VCF file) and the reference (the 1000 Genomes reference VCF file). Do you know if there is an option to keep all variants and simply ignore the ones that are missing from the reference while phasing?

Any help is much appreciated!

shapeit phasing haplotype • 218 views

You will really have to show some of the commands that you are using, in particular, those that are used to convert the IMPUTE format to VCF. Be wary of default parameter values.


Kevin Blighe: shapeit4 doesn't use the IMPUTE format; it uses BCF natively.

nhaus: For the first question, I think the answer is that it is not possible to retain this information. You could probably use bcftools annotate to pull the annotations you want out of the pre-phased BCF and then add them to the post-phased BCF.

I think the answer to your second question is also no (not 100% sure though). Again, you could take the excluded variants from the pre-phased BCF and merge them back in with the post-phased BCF, but that may be a bit of a pain.
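To make the bcftools annotate suggestion concrete, here is a minimal sketch of how the lost fields might be copied back. The file names, the `restore_annotations` function and the `DRY_RUN` switch are hypothetical and only for illustration; the column list `QUAL,INFO,^FORMAT/GT` (all INFO tags and every FORMAT tag except GT) is based on the bcftools annotate documentation and should be checked against your bcftools version.

```shell
#!/usr/bin/env bash
# Sketch: copy QUAL, INFO and the non-GT FORMAT tags from the unphased
# file onto the phased one. All names here are hypothetical.
set -eu

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

restore_annotations() {
    local unphased="$1" phased="$2" out="$3"
    # The annotation source given to -a must be compressed and indexed.
    run bcftools index -f "$unphased"
    # -a: file to take annotations from; -c: what to copy.
    # "^FORMAT/GT" excludes GT, so the phased genotypes are left untouched.
    run bcftools annotate -a "$unphased" -c 'QUAL,INFO,^FORMAT/GT' \
        -Ob -o "$out" "$phased"
    run bcftools index -f "$out"
}
```

Something like `DRY_RUN=1 restore_annotations unphased.bcf phased.bcf phased.annotated.bcf` prints the commands without running them, which is a cheap way to inspect them first. Only sites present in the phased file receive annotations; variants dropped during phasing are not restored by this step.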


Getting mixed up by answering too many questions.


blame the people who make so many file formats :)


Thank you for your answer. I think merging the excluded variants back in is quite "easily" possible, because the authors describe how they decide which variants to exclude. So getting these should be simple, and the merge command is also straightforward. It feels like unnecessary runtime, but what can you do.

I didn't know about bcftools annotate; I will look into it. Thank you very much!
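For the merge step discussed in this thread, one possible sketch is to let bcftools find the sites that are in the input but missing from the phased output, rather than re-deriving SHAPEIT's exclusion rules. The function name, file names and `DRY_RUN` switch are hypothetical. It assumes both files contain the same samples in the same order, and the restored sites keep their original, unphased genotypes.

```shell
#!/usr/bin/env bash
# Sketch: add variants that were dropped during phasing back to the
# phased file. All names here are hypothetical.
set -eu

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

restore_excluded_sites() {
    local unphased="$1" phased="$2" out="$3"
    local excluded="${out%.bcf}.excluded.bcf"
    # isec and concat -a both need indexed inputs.
    run bcftools index -f "$unphased"
    run bcftools index -f "$phased"
    # -C: sites present only in the first file; -w1: write records from it.
    run bcftools isec -C -w1 -Ob -o "$excluded" "$unphased" "$phased"
    run bcftools index -f "$excluded"
    # -a lets the two files interleave by position in the output.
    run bcftools concat -a -Ob -o "$out" "$phased" "$excluded"
    run bcftools index -f "$out"
}
```

A dry run such as `DRY_RUN=1 restore_excluded_sites unphased.bcf phased.bcf merged.bcf` shows the commands before anything is written. Because the output mixes phased and unphased records, downstream tools that expect fully phased data may need the restored sites filtered out again.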