Question: Plink - Non Human Data
Zev.Kronenberg (United States) wrote, 8.8 years ago:

I have read that PLINK supports some non-human organisms: mouse, dog, etc.

I, however, am not lucky enough to have a "model" organism. Is there any way to force PLINK to take scaffolds as opposed to chromosomes (and a limited number of chromosomes at that)? I have several hundred scaffolds.

population format plink • 3.9k views
iw9oel_ad wrote, 8.8 years ago:

I wanted to know about Plink's support for different organisms too, but could not find the information in the documentation. Looking at the source code was quicker:

grep -E 'define.+Chromosome' helper.cpp

void defineHorseChromosomes()
void defineSheepChromosomes()
void defineRiceChromosomes()
void defineDogChromosomes()
void defineMouseChromosomes()
void defineCowChromosomes()
void defineHumanChromosomes()

So the answer to your question is no, unless you want to write some C++ to extend Plink. Looking into these functions, I found code for setting up the chromosomes in a 'par' (options) object.

To find where these are called from:

grep defineDogChromosomes *

plink.cpp:  if (par::species_dog) defineDogChromosomes();

In plink.cpp:

if (par::species_dog) defineDogChromosomes();
else if (par::species_sheep) defineSheepChromosomes();
else if (par::species_cow) defineCowChromosomes();
else if (par::species_horse) defineHorseChromosomes();
else if (par::species_rice) defineRiceChromosomes();
else if (par::species_mouse) defineMouseChromosomes();
else defineHumanChromosomes();

You may be able to get away with writing one extra function, but I would also look at the code for the specific Plink analysis functions to see how your new scaffold definitions would be used.


Very pertinent point made at the end. The dog genome, for example, has much longer stretches of high LD/haplotypes than the human genome. Differences like that would need to be taken into account.

Reply written 8.8 years ago by Larry_Parnell

Caddymob (United States) wrote, 8.8 years ago:

I've run across this problem too...

My solution was to simply code all the chromosomes as 1 and the positions as 1, 2, 3, 4, etc., just to get through PLINK. Since the SNP IDs should be unique, and thus each map to a specific chromosome/scaffold/contig and position, you can create a lookup table (I do this in R) and map the SNPs back to their correct coordinates.

Unfortunately this will kill any LD type stuff in PLINK, but if you are only doing single SNP analyses, this works for me.
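The recoding described above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual script (which was written in R): it assumes a standard four-column PLINK .map layout (chromosome, SNP ID, genetic distance, base-pair position), and the function name `recode_map` and the scaffold and SNP names are hypothetical. The marker order is left untouched so that it still matches the genotype columns of the accompanying .ped file.

```python
def recode_map(map_lines):
    """Recode every SNP to chromosome 1 with positions 1, 2, 3, ...

    Returns (recoded_lines, lookup), where lookup maps each SNP ID to
    its original (scaffold, position) so results can be mapped back.
    Assumes four whitespace-separated .map columns per line.
    """
    recoded = []
    lookup = {}
    index = 0
    for line in map_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        scaffold, snp_id, cm, bp = fields[:4]
        if snp_id in lookup:
            # The workaround only works if SNP IDs are unique.
            raise ValueError(f"duplicate SNP ID: {snp_id}")
        index += 1
        lookup[snp_id] = (scaffold, int(bp))
        recoded.append(f"1\t{snp_id}\t{cm}\t{index}")
    return recoded, lookup


# Hypothetical input lines, in the same order as the original .map file.
example = [
    "scaffold_12\tsnpA\t0\t5021",
    "scaffold_12\tsnpB\t0\t9187",
    "scaffold_340\tsnpC\t0\t77",
]
recoded, lookup = recode_map(example)
print("\n".join(recoded))
print(lookup["snpC"])
```

After the PLINK run, the SNP IDs in the output can be looked up in `lookup` to recover the real scaffold and position.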


I also took this approach, although I was bummed because, like you mentioned, it destroyed subsequent analyses.

Reply written 8.8 years ago by Zev.Kronenberg

I was thinking about this too. If, say, LD analyses were what you were after, what about binning your contigs into 22 chromosomes, but within each chromosome separating the contigs by a big gap, say 5 Mb, so you don't get spurious LD results between them? I haven't tested this, but perhaps it would work. I don't think PLINK has any hard limits on chromosome length, so even if a pseudo-chromosome ends up 1e10 bp long, this might work. Just an idea...
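A rough sketch of that binning idea, equally untested against PLINK itself. The 22 pseudo-chromosomes and the 5 Mb gap come from the suggestion above; the round-robin assignment, the function names `assign_offsets` and `to_pseudo`, and the scaffold names are assumptions made purely for illustration.

```python
GAP_BP = 5_000_000   # spacer between scaffolds, as suggested above
N_CHROMS = 22        # number of pseudo-chromosomes, as suggested above


def assign_offsets(scaffold_lengths, n_chroms=N_CHROMS, gap=GAP_BP):
    """Map each scaffold to (pseudo_chrom, offset).

    scaffold_lengths: dict of scaffold name -> length in bp.
    Scaffolds are dealt out round-robin (an arbitrary choice) and laid
    end to end on their pseudo-chromosome with `gap` bp between them.
    """
    next_start = [0] * n_chroms
    placement = {}
    for i, (name, length) in enumerate(scaffold_lengths.items()):
        chrom = i % n_chroms
        placement[name] = (chrom + 1, next_start[chrom])
        next_start[chrom] += length + gap
    return placement


def to_pseudo(placement, scaffold, pos):
    """Translate a scaffold position into pseudo-chromosome coordinates."""
    chrom, offset = placement[scaffold]
    return chrom, offset + pos


# Hypothetical scaffolds; two pseudo-chromosomes keep the example small.
lengths = {"scaf_1": 1000, "scaf_2": 2000, "scaf_3": 500}
placement = assign_offsets(lengths, n_chroms=2)
print(placement)
print(to_pseudo(placement, "scaf_3", 42))
```

Keeping `placement` around allows results to be translated back to scaffold coordinates afterwards. With several hundred scaffolds the pseudo-chromosomes get long quickly, so it would be worth checking the resulting positions against whatever limits PLINK actually imposes, since the no-hard-limit assumption above is untested.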

Reply written 8.8 years ago by Caddymob