How to merge multiple ped files without missing genotypes
4.9 years ago
kclaudio21 ▴ 10


I am working with NGS data in which each sample was converted from VCF to a ped file individually. I want to merge all samples into a single ped file so that I can run further analyses in PLINK, but when I do so I get a lot of missing data at SNP positions where many individuals are homozygous for the reference allele. The VCF files do not contain genotype information at loci where the individual is homozygous for the reference allele, so when a file is merged with the others, PLINK treats those positions as missing genotypes (0 0). I thought the --merge-mode 5 option would avoid this problem when merging, but it did not help. I also considered merging all the VCF files into a single VCF first and then converting that to a ped file, but I have read that many people run into the same problem even when all their data are already merged in one VCF.
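To make the problem concrete, here is a minimal toy sketch of the effect described above. It is not PLINK's actual merge logic: the sample names, positions, alleles, and the `merge` helper are all hypothetical, and it assumes the reference allele at each position is known.

```python
# Toy illustration only: sample names, positions, and alleles are hypothetical.
# Each per-sample call set lists only variant sites, like a single-sample VCF
# that omits homozygous-reference positions.
REF = {101: "A", 202: "G", 303: "C"}  # position -> reference allele (assumed known)

calls = {
    "sample1": {101: ("A", "T")},
    "sample2": {202: ("T", "T"), 303: ("C", "G")},
}

MISSING = ("0", "0")  # missing genotype as coded in a ped file


def merge(calls, ref, fill_with_ref):
    """Build one genotype row per sample over the union of all sites."""
    sites = sorted({pos for genos in calls.values() for pos in genos})
    merged = {}
    for sample, genos in calls.items():
        row = []
        for pos in sites:
            if pos in genos:
                row.append(genos[pos])            # site was called in this sample
            elif fill_with_ref:
                row.append((ref[pos], ref[pos]))  # assume homozygous reference
            else:
                row.append(MISSING)               # a naive merge marks it missing
        merged[sample] = row
    return sites, merged


sites, naive = merge(calls, REF, fill_with_ref=False)
_, filled = merge(calls, REF, fill_with_ref=True)

for sample in calls:
    print(sample, "naive:", naive[sample], "filled:", filled[sample])
```

Filling with the reference allele is only valid if the absent sites were actually covered and called as homozygous reference, which a variants-only VCF cannot confirm on its own.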

Any suggestions?

P.S. I am not an expert in bioinformatics, so I would appreciate any suggestion that does not involve using scripts.

merge vcf • 1.9k views

