Plink: Retrieving specific SNP data for individuals in dataset
9.8 years ago
hessjl ▴ 90

I am new to plink and have a (hopefully) simple data management operation, but little idea of how to implement it. The essentials are listed here:

  1. I have three genotype call files: data.bed, data.fam, and data.bim.
  2. I created a file snps.txt containing a list of markers.
  3. My goal is to create a file with subject IDs listed in rows and SNP IDs (as per snps.txt) as column headers with genotypes listed underneath.

Question: How might I go about performing this task?

9.8 years ago

If you're fine with the main body of the file containing allele counts (0/1/2) rather than allele names, you can use

plink --bfile data --extract snps.txt --recodeA

and then use Unix cut on the resulting plink.raw to drop any of the leading columns (FID, IID, PAT, MAT, SEX, PHENOTYPE) you don't want.
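A minimal sketch of the cut step. To keep it runnable without PLINK, it uses a made-up toy file (toy.raw, with invented IDs and genotypes) in place of the real plink.raw, and it assumes the file is space-delimited; check your own output before relying on the field numbers.

```shell
# Toy stand-in for the .raw file written by --recodeA (IDs and genotypes are made up).
cat > toy.raw <<'EOF'
FID IID PAT MAT SEX PHENOTYPE rs1_A rs2_G
F1 I1 0 0 1 -9 0 2
F2 I2 0 0 2 -9 1 NA
EOF

# Keep IID (field 2) plus every genotype column (field 7 onward).
# Assumes single-space delimiters between fields.
cut -d' ' -f2,7- toy.raw > toy_genotypes.txt
cat toy_genotypes.txt
```

On real data, replace toy.raw with plink.raw (or whatever name --out produced).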

Otherwise,

plink --bfile data --extract snps.txt --recode compound-genotypes

(this requires PLINK 1.9) will almost get you there, but you'll need to add a header line on your own. Or

plink --bfile data --extract snps.txt --recode

works if you want two columns per SNP.
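For the compound-genotypes route, the missing header line could be built roughly as follows. This is only a sketch: toy.map and toy.ped are made-up stand-ins for the plink.map and plink.ped that PLINK writes, and the header names for the six leading columns are my own choice. The SNP IDs are read from the .map file rather than from snps.txt, because the output keeps the marker order of the dataset, not the order of the list passed to --extract.

```shell
# Toy stand-ins for plink.map and plink.ped (positions, IDs and genotypes are made up).
cat > toy.map <<'EOF'
1 rs1 0 1000
1 rs2 0 2000
EOF
cat > toy.ped <<'EOF'
F1 I1 0 0 1 -9 AA GG
F2 I2 0 0 2 -9 AG 00
EOF

# Header = six fixed column names, then the SNP IDs (column 2 of the .map) in file order.
{
  printf 'FID IID PAT MAT SEX PHENOTYPE'
  awk '{ printf " %s", $2 }' toy.map
  printf '\n'
  cat toy.ped
} > toy_with_header.txt
cat toy_with_header.txt
```

The same header trick does not apply directly to plain --recode output, since that format has two allele columns per SNP.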


Thank you for the clear explanation, chrchang523! The last command you provided worked like a charm.
