8.8 years ago

kindlychung

PLINK .bed files use 2 bits for each genotype data point, which is very efficient in terms of storage space but not so convenient for numerical analysis. How does PLINK read these bits into a matrix of int/double for linear algebra operations?

Have you gone through the source code?

I am trying to, but I am not that good at C. Could you tell me which functions are involved so that I can focus on them?

I'd search the source code for `fopen` and `fread`, since those will give you a clue where to start.
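If it helps to see the idea before reading the C, here is a minimal Python sketch of the 2-bit unpacking itself. It is not PLINK's code, only the documented PLINK 1 variant-major .bed layout: three magic bytes (`0x6c 0x1b 0x01`), then for each variant a block of ceil(n_samples / 4) bytes, with four genotypes per byte stored lowest-order bits first. The function name `decode_bed` and the choice to return A1 allele counts with `None` for missing calls are my own assumptions for illustration. The sample and variant counts are not stored in the .bed file, so they would have to come from the .fam and .bim files.

```python
import math

# PLINK 1 .bed magic bytes; the third byte 0x01 means variant-major layout.
BED_MAGIC = bytes([0x6C, 0x1B, 0x01])

# 2-bit code -> number of A1 alleles (an assumed convention for this sketch).
# 00 = homozygous A1, 01 = missing, 10 = heterozygous, 11 = homozygous A2
CODE_TO_A1_COUNT = {0b00: 2, 0b01: None, 0b10: 1, 0b11: 0}


def decode_bed(data, n_samples, n_variants):
    """Hypothetical helper: unpack .bed bytes into a variants x samples list."""
    if data[:3] != BED_MAGIC:
        raise ValueError("not a variant-major PLINK 1 .bed buffer")

    # Each variant starts on a byte boundary; unused high bits are padding.
    bytes_per_variant = math.ceil(n_samples / 4)
    expected = 3 + bytes_per_variant * n_variants
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(data)}")

    matrix = []
    for v in range(n_variants):
        start = 3 + v * bytes_per_variant
        block = data[start:start + bytes_per_variant]
        row = []
        for s in range(n_samples):
            byte = block[s // 4]           # which byte holds sample s
            shift = 2 * (s % 4)            # lowest-order pair is the first sample
            code = (byte >> shift) & 0b11  # isolate the 2-bit genotype code
            row.append(CODE_TO_A1_COUNT[code])
        matrix.append(row)
    return matrix


# One variant, five samples: 0xE4 packs codes 00, 01, 10, 11; 0x02 packs code 10.
example = BED_MAGIC + bytes([0xE4, 0x02])
print(decode_bed(example, n_samples=5, n_variants=1))
# [[2, None, 1, 0, 1]]
```

In a real script the `data` argument would be the result of `open(path, "rb").read()`, which is the Python counterpart of the `fopen`/`fread` calls to look for in the C source. A production reader would typically use a 256-entry lookup table or vectorised bit operations rather than a per-sample loop, but the bit arithmetic is the same.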