Hi everyone, I have a plink file and I want to get the distribution of calls (AA/Aa/aa) for each sample. How can I get this? Any help would be appreciated.
"plink --file ... --het" will give you the number of Aa calls for each sample.
If you also need AA/aa, you'd need to define which alleles are A and which ones are a. Once you have done that, you can create a synthetic sample with all aa calls, use --merge to combine that sample with your real dataset, and then run "--genome full". Then look at the lines of the .genome file that include your artificial all-aa sample; the IBS0 and IBS2 columns of those lines will give you the additional counts you need.
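To see why the all-aa trick works: at each variant, an all-aa genotype shares 0 alleles identical by state with an AA call, 1 with an Aa call, and 2 with an aa call, so the IBS counts against the synthetic sample are that sample's AA/Aa/aa counts. Here is a minimal Python sketch of that counting logic only. It is not PLINK's implementation; the function name, the toy genotypes, the "0" missing code, and the choice of which allele is "a" at each variant are all assumptions for illustration.

```python
from collections import Counter

MISSING = "0"  # assumption: missing alleles are coded as "0"


def ibs_counts_vs_all_aa(genotypes, minor_alleles):
    """Count IBS0/IBS1/IBS2 between one sample and a synthetic all-aa sample.

    genotypes: list of (allele1, allele2) tuples, one per variant.
    minor_alleles: the allele treated as "a" at each variant (you define this).
    """
    counts = Counter({"IBS0": 0, "IBS1": 0, "IBS2": 0})
    for (a1, a2), minor in zip(genotypes, minor_alleles):
        if MISSING in (a1, a2):
            continue  # missing calls are not counted
        # Number of alleles shared with the aa genotype at this variant.
        shared = (a1 == minor) + (a2 == minor)
        counts[f"IBS{shared}"] += 1
    return counts


# Hypothetical toy data: four variants, with "a" defined as G, T, C, A.
minor_alleles = ["G", "T", "C", "A"]
sample = [("A", "A"), ("C", "T"), ("C", "C"), ("0", "0")]

counts = ibs_counts_vs_all_aa(sample, minor_alleles)
# IBS0 corresponds to AA, IBS1 to Aa, IBS2 to aa.
print("AA:", counts["IBS0"], "Aa:", counts["IBS1"], "aa:", counts["IBS2"])
```

The same mapping is what you read off the .genome lines that pair each real sample with the artificial one.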
Thanks for your comments. Yes, I need AA/Aa/aa. --het will return a text file with the following columns: 1. Family ID, 2. Within-family ID, 3. Observed number of homozygotes, 4. Expected number of homozygotes, 5. Number of non-missing autosomal genotypes, 6. Method-of-moments F coefficient estimate. No information about heterozygotes (Aa)! Also, how can I perform what you explained in the second part of your comments? Any scripts in R? Thanks
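The heterozygote count is not a column of its own, but it can be derived from the columns listed above: the number of Aa calls is the number of non-missing autosomal genotypes minus the observed number of homozygotes. Here is a minimal sketch in Python (not R). It assumes a whitespace-delimited .het file whose header row uses the names FID, IID, O(HOM), E(HOM), N(NM), F; the header names and the toy rows are assumptions, so check them against your own output.

```python
import io


def het_counts(handle):
    """Return {(FID, IID): number of heterozygous calls} from a .het-style table."""
    header = handle.readline().split()
    col = {name: i for i, name in enumerate(header)}
    result = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # float() first, so both integer and floating-point notation are accepted
        n_hom = int(float(fields[col["O(HOM)"]]))
        n_nonmissing = int(float(fields[col["N(NM)"]]))
        # Aa calls = non-missing genotypes minus homozygous genotypes
        result[(fields[col["FID"]], fields[col["IID"]])] = n_nonmissing - n_hom
    return result


# Hypothetical toy table standing in for a real .het file.
toy = io.StringIO(
    " FID  IID  O(HOM)  E(HOM)  N(NM)  F\n"
    " F1   S1   700     650.5   1000   0.1416\n"
    " F2   S2   620     650.5   1000   -0.08727\n"
)

result = het_counts(toy)
for (fid, iid), n_het in result.items():
    print(fid, iid, n_het)
```

For a real file, pass an open file handle instead of the toy table. This only covers autosomal variants and does not split the homozygotes into AA and aa; for that you still need to define which allele is A at each variant.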
could you please post an example of your file?
fin swimmer
Hi, Plink files (my input files) are .map and .ped files:
.ped file is as follows:
A text file with the following fields:
.map file:
A text file with no header line, and one line per variant with the following 3-4 fields: