How does plink read bed files?
9.2 years ago
kindlychung ▴ 60

plink bed files use 2 bits for each genotype data point, which is very efficient in terms of storage space but not so convenient for numerical analysis. How does plink read these bits into a matrix of int/double for linear algebra operations?

plink • 3.6k views

Have you gone through the source code?


I am trying to, but I am not that good at C. Could you tell me which functions are involved so that I can focus on those?


I'd search the source code for fopen and fread, since those will give you a clue where to start.
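If reading C is the obstacle, it may help to see what the unpacking step amounts to. The following is a minimal Python sketch, not PLINK's own code: the names `decode_bed`, `BED_MAGIC` and `CODE_TO_A1_COUNT` are made up for illustration. It assumes a variant-major PLINK 1 .bed file (3 magic bytes, then ceil(n_samples / 4) bytes per variant, 4 genotypes per byte, lowest bits first) and maps each 2-bit code to a count of A1 alleles, with None for missing.

```python
BED_MAGIC = bytes([0x6C, 0x1B, 0x01])  # PLINK 1 .bed header, variant-major mode

# 2-bit code -> number of A1 alleles; 0b01 marks a missing genotype.
CODE_TO_A1_COUNT = {0b00: 2, 0b01: None, 0b10: 1, 0b11: 0}


def decode_bed(data, n_samples):
    """Unpack .bed bytes into a list of per-variant genotype lists.

    Hypothetical helper for illustration; `data` is the whole file as bytes.
    """
    if data[:3] != BED_MAGIC:
        raise ValueError("not a variant-major PLINK 1 .bed file")
    bytes_per_variant = (n_samples + 3) // 4  # last byte is padded if needed
    body = data[3:]
    if len(body) % bytes_per_variant != 0:
        raise ValueError("file size does not match n_samples")
    matrix = []
    for start in range(0, len(body), bytes_per_variant):
        block = body[start:start + bytes_per_variant]
        row = []
        for i in range(n_samples):
            # Sample i sits in byte i // 4, at bit offset 2 * (i % 4).
            code = (block[i // 4] >> (2 * (i % 4))) & 0b11
            row.append(CODE_TO_A1_COUNT[code])
        matrix.append(row)
    return matrix
```

The number of samples is not stored in the .bed file itself; it would normally come from counting the lines of the accompanying .fam file.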

9.2 years ago

The short answer is that it usually doesn't. Whenever possible, PLINK 1.9 avoids unpacking the data; instead it uses bitwise operations and population count (counting the set bits in a machine word) to perform computation directly on the 2-bit representation.
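To make the idea concrete, here is a small Python sketch of counting alleles for one variant without unpacking any genotypes. It is only an illustration of the technique, not PLINK's implementation: `popcount` and `allele_counts` are hypothetical names, and it assumes the PLINK 1 encoding (00 = homozygous A1, 01 = missing, 10 = heterozygous, 11 = homozygous A2) with zero padding bits in the last byte.

```python
def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def allele_counts(block, n_samples):
    """Return (a1_count, a2_count, n_missing) for one packed variant block.

    Hypothetical helper: `block` holds the packed bytes of a single variant.
    """
    # Little-endian so that sample 0 occupies the two lowest bits.
    word = int.from_bytes(block, "little")
    mask = int.from_bytes(b"\x55" * len(block), "little")  # 0101...01
    low = word & mask           # low bit of every 2-bit genotype
    high = (word >> 1) & mask   # high bit of every 2-bit genotype
    # Missing (01): low bit set, high bit clear.
    n_missing = popcount(low & (high ^ mask))
    # Het (10) contributes 1 A2 allele, hom A2 (11) contributes 2.
    a2_count = popcount(high) + popcount(high & low)
    # Every non-missing sample carries exactly two alleles.
    a1_count = 2 * (n_samples - n_missing) - a2_count
    return a1_count, a2_count, n_missing
```

For example, the two bytes 0xE4 0x00 with 5 samples encode the genotypes hom A1, missing, het, hom A2, hom A1, and `allele_counts(bytes([0xE4, 0x00]), 5)` returns `(5, 3, 1)`. The whole variant is handled with a few mask and count operations rather than a per-sample loop, which is where the speed advantage comes from.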
