How to understand and plot the PLINK mdist file
2.4 years ago
raalsuwaidi ▴ 100

Hi all,

I really appreciate all the help, as I am new to using PLINK.

I am running the command below to generate the distance matrix for a VCF file containing GWAS samples.

plink --vcf input.vcf.gz --distance 1-ibs flat-missing square --double-id --out plinkdistancematrix

The output files are:

   plinkdistancematrix.mdist
   plinkdistancematrix.mdist.id

My first problem is that the files don't have a header, and the PLINK documentation does not explain them. So what do the files mean?

My second problem is how to plot the mdist file according to the different populations.

Here is part of the first line of the mdist file. You will see that it doesn't have the sample IDs.

0   0.0692113   0.0677864   0.0685928   0.0688168   0.0683407   0.0678936   0.0698271

So what does the file mean, and how do I plot it according to the populations?

plink mdist plot • 906 views
2.4 years ago
raalsuwaidi ▴ 100

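I found details regarding the file.

The values in the square matrix are the pairwise distances between the samples. Because the command uses 1-ibs, each value is one minus the identity-by-state proportion between two samples, so the diagonal is 0 and the matrix is symmetric. The samples are in the same order as in the VCF file, and the accompanying .mdist.id file lists the sample IDs (FID and IID) in that same order, one sample per line.

For example, a 3-sample file would look like this once the sample IDs are added as row and column labels (the .mdist file itself contains only the numbers):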

          sample1   sample2   sample3
sample1   0         0.6       0.03
sample2   0.6       0         0.4
sample3   0.03      0.4       0
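
To attach the IDs to the numbers, the two files can be read together. This is only a minimal sketch using the Python standard library: the function names `load_mdist` and `distance` are made up for illustration, and it assumes the .mdist file is a whitespace-separated square matrix and the .mdist.id file has two columns (FID, IID) per line with no header.

```python
def load_mdist(mdist_path, id_path):
    """Return (sample_ids, matrix) from PLINK .mdist and .mdist.id files."""
    # .mdist.id: one line per sample, columns FID and IID, no header.
    # With --double-id both columns hold the VCF sample ID; we keep the IID.
    with open(id_path) as fh:
        ids = [line.split()[1] for line in fh if line.strip()]

    # .mdist: one row per sample, whitespace-separated distances, no header.
    with open(mdist_path) as fh:
        matrix = [[float(x) for x in line.split()] for line in fh if line.strip()]

    # Row i and column i of the matrix both belong to ids[i].
    if len(matrix) != len(ids) or any(len(row) != len(ids) for row in matrix):
        raise ValueError("matrix shape does not match the number of sample IDs")
    return ids, matrix


def distance(ids, matrix, sample_a, sample_b):
    """Look up the distance between two samples by ID."""
    return matrix[ids.index(sample_a)][ids.index(sample_b)]
```

With the files from the command above, usage would be something like `ids, matrix = load_mdist("plinkdistancematrix.mdist", "plinkdistancematrix.mdist.id")`, followed by `distance(ids, matrix, "sample1", "sample3")`.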

Hope this will be helpful.
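
For the plotting part of the question, one common option (not something PLINK does for you with this command) is to project the distance matrix to two dimensions with classical multidimensional scaling and colour the points by population. The sketch below assumes NumPy and matplotlib are installed; `classical_mds`, `plot_by_population` and the `population_of` dictionary (sample ID to population label, which you would have to build from your own metadata) are hypothetical names, not part of PLINK.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS: coordinates from a square distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances to get a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_components]
    # Negative eigenvalues (non-Euclidean distances) are clipped to zero.
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))


def plot_by_population(ids, coords, population_of, out_png="mds.png"):
    """Scatter plot of the first two MDS axes, one colour per population."""
    import matplotlib
    matplotlib.use("Agg")  # write to a file; no display needed
    import matplotlib.pyplot as plt

    # Samples missing from the (assumed) metadata are grouped as "unknown".
    pops = [population_of.get(sample, "unknown") for sample in ids]
    fig, ax = plt.subplots()
    for pop in sorted(set(pops)):
        idx = [i for i, p in enumerate(pops) if p == pop]
        ax.scatter(coords[idx, 0], coords[idx, 1], label=pop)
    ax.set_xlabel("MDS 1")
    ax.set_ylabel("MDS 2")
    ax.legend(title="Population")
    fig.savefig(out_png, dpi=150)
    plt.close(fig)
```

Given the sample IDs and the matrix in matching order, `coords = classical_mds(matrix)` followed by `plot_by_population(ids, coords, population_of)` would write the figure to `mds.png`. A heatmap or a hierarchical clustering of the same matrix would be reasonable alternatives.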
