Question: Extract Samples With Specific Rsid And Genotype Using Plink Or Similar Tools
Khader Shameer (Manhattan, NY) wrote, 7.9 years ago:

I have a PLINK-formatted database (bed/bim/fam files) and the corresponding recoded files (hh/ped/map). I am looking for an effective way to extract samples with specific genotypes from this database. I have looked through the PLINK manual and found that I can extract a set of samples using the "--keep" parameter and a set of SNPs using "--extract". I am wondering whether this can be done in a single step using another parameter or tool.

My input is a list of rsIDs and genotypes; I need to get sample IDs and genotypes as output.

INPUT

rs1800562 AA

OUTPUT

Sample1 AA
Sample5 AA
Sample22 AA

Is there an option in PLINK to do this, or should I use Unix 'grep' and/or a custom script to extract the data? Suggestions for other computational genomics tools that can do a similar task are also welcome.

gwas plink genomics genotyping • 4.9k views
Stephen (Charlottesville, Virginia) wrote, 7.9 years ago:

You can combine --keep and --extract in a single step, but you want to condition your --keep on the genotypes you get from your --extract, which PLINK can't do, to my knowledge. If you want a single ped file for each SNP, you could do something like:

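# take the rsID column of INPUT as the SNP list
awk '{print $1}' INPUT > mysnps
# write a transposed fileset (mysnps.tped/mysnps.tfam); --tfile only reads one, it does not write one
# (PLINK 1.07 spells this --recode --transpose)
plink --bfile data --extract mysnps --recode transpose --out mysnps
# (some code here to loop through each line of mysnps.tped, pull out the column indexes where the genotype matches, and write out a list of samples for each SNP)
# (some code here to run plink --keep for each list of samples)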

... but you probably already knew this, and just need an implementation. Sorry this wasn't much help.
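
For future reference, the tped loop described above could be sketched roughly as follows in Python. This is not the solution the original poster ended up with, only a minimal illustration under some assumptions: the file names INPUT, mysnps.tped and mysnps.tfam are the ones used above; the tped has the standard layout (chromosome, SNP ID, genetic distance, position, then two allele columns per sample, in tfam order); allele order within a genotype is ignored, so "AG" also matches "G A"; and the function names and the keep.<rsID>.txt output naming are made up for this example.

```python
from collections import defaultdict


def read_targets(path):
    """Read 'rsID genotype' pairs, e.g. 'rs1800562 AA', into a dict."""
    targets = {}
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 2:
                targets[fields[0]] = fields[1].upper()
    return targets


def read_tfam(path):
    """Return (family ID, individual ID) pairs in file order."""
    samples = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 2:
                samples.append((fields[0], fields[1]))
    return samples


def matching_samples(tped_path, samples, targets):
    """Map each requested rsID to the samples carrying the requested genotype."""
    hits = defaultdict(list)
    with open(tped_path) as handle:
        for line in handle:
            fields = line.split()
            # Columns 1-4 are chromosome, SNP ID, genetic distance, position.
            if len(fields) < 4 or fields[1] not in targets:
                continue
            snp = fields[1]
            # Assumption: allele order does not matter, so compare sorted pairs.
            wanted = sorted(targets[snp])
            alleles = fields[4:]
            for i, sample in enumerate(samples):
                pair = sorted(a.upper() for a in alleles[2 * i:2 * i + 2])
                if pair == wanted:
                    hits[snp].append(sample)
    return dict(hits)


def write_keep_files(hits, prefix="keep"):
    """Write one 'FID IID' list per SNP, the format plink --keep expects."""
    for snp, samples in hits.items():
        with open(f"{prefix}.{snp}.txt", "w") as out:
            for fid, iid in samples:
                out.write(f"{fid} {iid}\n")
```

Calling matching_samples("mysnps.tped", read_tfam("mysnps.tfam"), read_targets("INPUT")) and passing the result to write_keep_files would give one sample list per SNP, each of which can then be handed to plink --keep.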


Thanks a lot Stephen. I worked out a solution based on your suggestion - tped was the hat-tip :). Please see if you can add this as an answer for future reference.

Reply written 7.9 years ago by Khader Shameer