The 1000 Genomes Project has released great data. I am trying to understand why the parents NA12891 and NA12892 of the "famous" CEU trio are not in the phase 3 1000 Genomes panel (not even in the VCF file containing the 31 related individuals that were set apart).
In contrast, their child NA12878 and the full YRI trio (NA19238, NA19239, NA19240) are included in phase 3.
As I understand it, this is because the YRI trio has both low- and high-coverage data, whereas the CEU parents have only high-coverage data. Is that correct? If so, why was this choice made?
Is it "safe" to merge the high-coverage data for NA12891 and NA12892 into the phase 3 VCF file?
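
For anyone who wants to reproduce the check, here is a minimal Python sketch of how one could list which sample IDs are missing from a VCF header. The function names `vcf_samples` and `missing_samples` are hypothetical, and the toy header below is made up for illustration (it is not the real phase 3 header); it only assumes the standard VCF layout in which sample IDs follow the nine fixed columns of the `#CHROM` line.

```python
import io

def vcf_samples(handle):
    """Return the sample IDs from the #CHROM header line of a VCF text stream."""
    for line in handle:
        if line.startswith("#CHROM"):
            # Columns 1-9 are fixed (CHROM..FORMAT); the rest are sample IDs.
            return line.rstrip("\n").split("\t")[9:]
        if not line.startswith("#"):
            break  # reached data records without finding a #CHROM line
    return []

def missing_samples(handle, wanted):
    """Return the IDs in `wanted` that are absent from the VCF header."""
    present = set(vcf_samples(handle))
    return [s for s in wanted if s not in present]

# Toy header for illustration only (NOT the real phase 3 sample list).
toy = io.StringIO(
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT"
    "\tNA12878\tNA19238\tNA19239\tNA19240\n"
)
ceu_trio = ["NA12878", "NA12891", "NA12892"]
result = missing_samples(toy, ceu_trio)
print(result)
```

For a real bgzipped file, the same functions should work on a handle opened with `gzip.open(path, "rt")`, where `path` points to the VCF of interest.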