PLINK IBD calculation for given samples against the rest of the data
6.2 years ago
gaow • 0

PLINK's IBD calculation via --genome computes IBD/IBS for all pairs of samples in the dataset. Is there a way to list M specific samples to compute against the rest of the data, so that the output has (N - M) * M pairs rather than all N * (N - 1) / 2 pairs?

SNP PLINK • 2.1k views

I've added the PLINK tag, to help those watching this tag to find this question.

6.2 years ago

This isn't directly supported by plink 1.9. However, if M is large enough that it isn't reasonable to just perform the entire computation and then filter for the lines of interest, the following hack will help:

  1. Create a file (I'll call this id_order.txt) which has the M sample IDs of interest on the bottom, with the other (N-M) on top.
  2. Use "plink --bfile ... --indiv-sort f id_order.txt --make-bed reordered" to create a new fileset with the desired sample order.
  3. Run "plink --bfile reordered --genome --parallel k k --out ...", where k is the largest integer not exceeding N/(2M), i.e. floor(N/(2M)). With "--parallel k k", only the last of k pieces of the pairwise computation is run.
  4. The resulting .genome.[k] file will still have a few extra lines, so you may want to use e.g. a Python script to filter them out.
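For steps 1, 3 and 4, here is a rough Python sketch of the bookkeeping. It is only an illustration under a few assumptions: that id_order.txt takes one "FID IID" pair per line, that the .genome piece is whitespace-delimited with FID1 IID1 FID2 IID2 as its first four columns, and that a header line (starting with FID1) may or may not be present in that piece. The function names are made up for this example and are not part of plink.

```python
def id_order_lines(all_ids, target_ids):
    """Step 1: lines for id_order.txt, targets last, others on top.

    all_ids and target_ids are sequences of (FID, IID) tuples.
    """
    targets = set(target_ids)
    others = [s for s in all_ids if s not in targets]
    wanted = [s for s in all_ids if s in targets]
    return [f"{fid} {iid}" for fid, iid in others + wanted]


def parallel_piece_count(n, m):
    """Step 3: k = floor(N / (2M)), but never below 1."""
    return max(1, n // (2 * m))


def filter_genome(lines, target_ids):
    """Step 4: keep only pairs with exactly one sample of interest.

    A header line (first field 'FID1') is passed through if present.
    """
    targets = set(target_ids)
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "FID1":
            yield line
            continue
        first_is_target = (fields[0], fields[1]) in targets
        second_is_target = (fields[2], fields[3]) in targets
        # Exactly one side in the target set gives the (N - M) * M pairs.
        if first_is_target != second_is_target:
            yield line


if __name__ == "__main__":
    samples = [("F1", "A"), ("F2", "B"), ("F3", "C"), ("F4", "D")]
    of_interest = [("F2", "B")]
    print("\n".join(id_order_lines(samples, of_interest)))
    print("k =", parallel_piece_count(len(samples), len(of_interest)))
```

To filter a real file, you could pass an open file handle for the .genome.[k] piece as `lines` and write the yielded lines to a new file. If you also want the pairs among the M samples themselves, change the `!=` test to `or`.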