Question: Is It Possible To Subset Dosage Data With Plink?
Written 7.4 years ago by chancerh:


I am trying to use PLINK (v1.07) to read dosage data and write out a subset of that data without analyzing it. That is, I want to keep all of the SNPs but only a subset of the samples. Is this possible with PLINK? I have been using the --dosage and --write-dosage options, but I have not had any success. Details below.

I have imputed dosage data for 5 samples in a file called test.dose that looks like this:

SNP A1 A2    F1 D1-40 F2 D2-32 F3 D3-30 F4 D4-49 F5 D5-30
7:16719 A B 1.99800005159341 1.99599998397753 1.99800005159341 1.99800005159341 1.99599998397753
7:31273 A B 1.55099993944168 1.92400002479553 1.91199994832277 1.9119999781251 1.94999995082617

I have a test.fam file that looks like this:

F1 D1-40 -9 -9 2 -9
F2 D2-32 -9 -9 2 -9
F3 D3-30 -9 -9 2 -9
F4 D4-49 -9 -9 2 -9
F5 D5-30 -9 -9 2 -9

I have a list.txt file that contains the following:

F2 D2-32
F5 D5-30

I am running the following PLINK command:

plink --dosage test.dose format=1 --fam test.fam --keep list.txt --noweb --write-dosage --out subset

This outputs a file called subset.out.dosage, but it doesn't contain the dosages. It looks like this:

7:16719 A B
7:31273 A B

What I would like is the above file but with dosages for the samples contained in list.txt. I realize that there are many tools for manipulating text, but is this possible with PLINK?


I haven't tested it, but GenGen might do it; see: file1 -keep caseid.keep -prefix caseonly. Why do you need to subset?

Written 7.4 years ago by zx8754
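
If a plain text tool is acceptable, the subsetting itself is simple to do outside PLINK. Below is a minimal Python sketch, not a PLINK or GenGen feature. It assumes the layout shown in the question: a header of SNP A1 A2 followed by one FID IID pair per sample, and exactly one dosage value per sample on each SNP line (the format=1 case). The function name subset_dosage is made up for illustration, and the demo data is copied from the question.

```python
def subset_dosage(dosage_lines, keep_pairs):
    """Return dosage lines restricted to the samples in keep_pairs.

    Assumed layout (taken from the question, not from PLINK itself):
    header 'SNP A1 A2 FID1 IID1 FID2 IID2 ...' and one dosage value
    per sample on every SNP line.
    """
    lines = iter(dosage_lines)
    header = next(lines).split()
    ids = header[3:]
    # Pair up the header IDs as (FID, IID) tuples, in file order.
    samples = list(zip(ids[0::2], ids[1::2]))

    keep = set(keep_pairs)
    missing = keep - set(samples)
    if missing:
        # Design choice of this sketch: fail loudly on unknown samples.
        raise ValueError(f"samples not found in dosage header: {sorted(missing)}")

    # Column positions (relative to the first dosage column) to retain.
    kept_idx = [i for i, sample in enumerate(samples) if sample in keep]

    out_header = header[:3]
    for i in kept_idx:
        out_header.extend(samples[i])
    out = [" ".join(out_header)]

    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3 + len(samples):
            raise ValueError(f"unexpected number of columns for SNP {fields[0]}")
        dosages = [fields[3 + i] for i in kept_idx]
        out.append(" ".join(fields[:3] + dosages))
    return out


# Demo using the data from the question.
dose = [
    "SNP A1 A2    F1 D1-40 F2 D2-32 F3 D3-30 F4 D4-49 F5 D5-30",
    "7:16719 A B 1.99800005159341 1.99599998397753 1.99800005159341 1.99800005159341 1.99599998397753",
    "7:31273 A B 1.55099993944168 1.92400002479553 1.91199994832277 1.9119999781251 1.94999995082617",
]
keep = [("F2", "D2-32"), ("F5", "D5-30")]
result = subset_dosage(dose, keep)
print("\n".join(result))
```

Any iterable of lines works as the first argument, so an open file handle for test.dose can be passed directly, and the pairs can be built from list.txt by splitting each line into its two fields. The values are copied as text, so no precision is lost. A dosage file with two or three values per sample would need the column arithmetic adjusted.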