Question: Convert Binary Plink Files To Human Readable
5.5 years ago, marlena.siwiak wrote:

Hi!

I have received three plink files (*.bed, *.bim, and *.fam) from which I need to extract two columns of data in human-readable form. The files contain information on patients' genotypes and survival times.

I have managed to extract the genotype information by creating a *ped file (under plink 1.07-3 in Debian, Linux):

p-link --bfile mar2 --recode --tab --out outfile

The resulting file contains the following columns (each row is a patient):

family_ID sample_ID paternal_ID maternal_ID sex affection genotype

But there is no information on patients' survival times. Does anybody know how to extract them?

I know neither its column name nor how this information is encoded in the binaries. I probably need to list all column names first and guess which one is the right one, but I don't know how to do that. Or maybe there is some standard way to encode survival times in plink?

I would be grateful for detailed instructions, as I am not a regular plink user (and don't really want to be; I just need these two columns of data).

Thanks for your help.

5.5 years ago, zx8754 (London) wrote:

PED files:

A PED file must have 1 and only 1 phenotype in the sixth column.

"affection" is your 6th column; it can hold either a quantitative trait or an affection status. Apart from the binary files, your data may also come with alternate phenotype files or covariate files, which you might be missing.
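
For reference, an alternate phenotype file (the kind passed to plink with --pheno) is plain whitespace-delimited text: family ID, individual ID, then one or more phenotype columns, optionally with a header line starting with FID IID. If such a file turns up, a rough Python sketch like the one below could read it. The file contents and the column name survival_months are invented for illustration and are not taken from your data.

```python
import io

def read_alt_pheno(handle):
    """Read a PLINK-style alternate phenotype file into a dict.

    Keys are (FID, IID) tuples; values map phenotype name -> raw string.
    """
    table = {}
    names = None
    for line in handle:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if names is None:
            if parts[:2] == ["FID", "IID"]:
                names = parts[2:]  # header row supplies the phenotype names
                continue
            # No header: fall back to generic names pheno1, pheno2, ...
            names = ["pheno%d" % (i + 1) for i in range(len(parts) - 2)]
        table[(parts[0], parts[1])] = dict(zip(names, parts[2:]))
    return table

# Hypothetical example content (invented values, not from the real study).
example = io.StringIO(
    "FID IID survival_months\n"
    "idXXX idXXX 14.5\n"
    "idYYY idYYY 31.0\n"
)
pheno = read_alt_pheno(example)
print(pheno[("idXXX", "idXXX")]["survival_months"])
```

Values are kept as strings on purpose, so that missing-value codes such as -9 are not silently turned into numbers.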

5.5 years ago, marlena.siwiak wrote:

In the created ped file I have 7 columns separated by tabs:

idXXX    idXXX     0    0    1    2    T T
idYYY   idYYY    0    0    1    2    C T
...

The nucleotides in the last column are separated by a space. Based on the manual, I guessed that the sixth column is "affection". Affection or not, it is certainly not the survival time.
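
To make the layout concrete, here is a small Python sketch of how one such row splits up, assuming the fields really are tab-separated, as --tab should produce. The function name parse_ped_row is just an illustrative helper. In PLINK's default coding, sex 1 means male and phenotype 2 means affected, so the sixth column looks like a case/control status rather than a time.

```python
def parse_ped_row(line):
    """Split one tab-delimited PED row into its six leading fields and genotypes."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 7:
        raise ValueError("expected at least 7 tab-separated fields, got %d" % len(fields))
    fid, iid, father, mother, sex, phenotype = fields[:6]
    # Each remaining field is one genotype: two alleles separated by a space.
    genotypes = [tuple(g.split()) for g in fields[6:]]
    return {
        "family_id": fid,
        "sample_id": iid,
        "paternal_id": father,
        "maternal_id": mother,
        "sex": sex,
        "phenotype": phenotype,
        "genotypes": genotypes,
    }

# First example row from above, written with explicit tabs.
row = parse_ped_row("idXXX\tidXXX\t0\t0\t1\t2\tT T\n")
print(row["phenotype"], row["genotypes"])
```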

You say I'm missing some more files... I cannot rule that out; however, the research authors have assured me a few times that the survival times are there, in the binaries. Any ideas how to get them out, or at least check whether they are really there?


In your binary files, the phenotype is the 6th column of your .fam file. If this column is not what you are looking for, then the files do not contain what you want.
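
The .fam file is plain text (six whitespace-separated columns: family ID, individual ID, father, mother, sex, phenotype), so it can be inspected directly. A minimal Python sketch for checking what the 6th column holds might look like this; the function names and the two example rows are only illustrative, and the check assumes PLINK's default case/control codes (1, 2, with 0 or -9 for missing).

```python
import io
from collections import Counter

def fam_phenotype_counts(handle):
    """Count how often each value occurs in the 6th (phenotype) column."""
    counts = Counter()
    for line in handle:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 6:
            raise ValueError("expected 6 columns, got %d" % len(parts))
        counts[parts[5]] += 1
    return counts

def looks_like_case_control(counts):
    """True if every phenotype value is one of the default status codes."""
    return set(counts) <= {"0", "1", "2", "-9"}

# Illustrative rows mirroring the PED excerpt in this thread.
example = io.StringIO("idXXX idXXX 0 0 1 2\nidYYY idYYY 0 0 1 2\n")
counts = fam_phenotype_counts(example)
print(counts, looks_like_case_control(counts))
```

On the real data, the same function could be given an open file, e.g. `with open("mar2.fam") as fh: fam_phenotype_counts(fh)`, using the mar2 prefix from the question. If only status codes come back, the survival times are not stored in these files.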

Reply written 5.5 years ago by Maxime Lamontagne