I have a GenomeStudio genotype file in which missing genotypes are denoted by "-".
From this file I generated the map, fam and lgen files for each chromosome and converted them to ped format using plink's --recode option. To get around the plink error "Error: Locus has >2 alleles", I used the --missing-genotype option with "-".
The ped files for each chromosome were generated successfully, but I am facing a couple of issues:
My lgen file corresponds to the map file, but after recoding, the ped file has far more columns than the map file has rows. I expect the number of columns to be twice the number of rows in the map file (two alleles per marker).
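For anyone wanting to reproduce the column check, here is a minimal Python sketch. It assumes the standard whitespace-delimited PED layout, which has 6 leading columns (family ID, individual ID, paternal ID, maternal ID, sex, phenotype) followed by two allele columns per marker, so the expected total is 6 + 2 x (number of map rows) rather than exactly twice the map rows. The function names are hypothetical helpers, not part of plink.

```python
def expected_ped_columns(n_markers):
    """Columns a PED line should have: 6 leading fields + 2 alleles per marker."""
    return 6 + 2 * n_markers


def count_map_markers(map_lines):
    """Count non-empty lines in a MAP file (one marker per line)."""
    return sum(1 for line in map_lines if line.strip())


def check_ped_line(ped_line, n_markers):
    """Return (actual, expected) column counts for one PED line."""
    actual = len(ped_line.split())
    return actual, expected_ped_columns(n_markers)
```

With real files, one could pass the lines of the map file to `count_map_markers` and the first line of the ped file to `check_ped_line`; a mismatch between the two returned numbers would point to a formatting problem in the input.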
When I try to merge all the chromosomes to compute summary statistics, the "-" entries in the data do not seem to be excluded and continue to cause errors.
Would converting all the "-" to 0 be the solution here? I am trying to understand how to exclude such data and what the best practices are.
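To make the conversion idea concrete, here is a rough Python sketch of what I mean. It assumes a 5-column lgen layout (family ID, individual ID, SNP ID, allele 1, allele 2) and relies on "0" being plink's default missing-genotype code. The function name is hypothetical, and whether this is the recommended approach is exactly what I am unsure about.

```python
def recode_missing_lgen_line(line, missing="-", plink_missing="0"):
    """Replace the missing-allele symbol in the two allele columns of an lgen line.

    Only columns 4 and 5 are touched, so IDs that contain "-" are left alone.
    Lines with fewer than 5 fields are returned with their fields unchanged.
    """
    fields = line.split()
    if len(fields) >= 5:
        fields[3:5] = [plink_missing if a == missing else a for a in fields[3:5]]
    return " ".join(fields)
```

Applied line by line to each chromosome's lgen file before running --recode, this would leave called genotypes unchanged and turn every "-" allele into "0".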
Thanks for any suggestions/feedback.