Question: Using SNPRelate on a 23andMe dataset
3.9 years ago
abims10 wrote:

I want to use 23andMe raw data in SNPRelate. My data looks like this, and there are 500 individuals:

# rsid    chromosome    position    genotype
rs4477212    1    82154    AA
rs3094315    1    752566    AA
rs3131972    1    752721    GG

But I first need to convert it to GDS format.

I need to define sample.id, snp.id, snp.position, etc., and also create the genotype matrix:

add.gdsn(newfile, "sample.id", sample.id)
add.gdsn(newfile, "snp.id", snp.id)
add.gdsn(newfile, "snp.position", snp.position)
add.gdsn(newfile, "snp.allele", c("A/G", "T/C", ...))



var.geno <- add.gdsn(newfile, "genotype",,, storage="bit2")


What I understand is that sample.id is the vector of all the user IDs, snp.id is the vector of all SNPs, and so on. So, in the genotype part, how would I indicate that user x's SNP y is AA? What kind of matrix is it?
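For context on the matrix question: SNPRelate stores each genotype as the number of copies of the reference ("A") allele, so every cell is 0, 1 or 2, with 3 meaning missing. The base-R sketch below shows one way such a SNP-by-sample matrix could be built from two-letter 23andMe calls. The user names, the calls, the `ref` vector and the `count_ref()` helper are all made up for illustration, and anything that is not a two-letter A/C/G/T call (such as the "--" no-call) is simply treated as missing here.

```r
# Toy data: two-letter calls for three hypothetical users at three SNPs.
# Rows are SNPs, columns are users; the values are made up.
calls <- rbind(
  rs4477212 = c(user1 = "AA", user2 = "AG", user3 = "--"),
  rs3094315 = c(user1 = "AA", user2 = "GG", user3 = "AG"),
  rs3131972 = c(user1 = "GG", user2 = "AG", user3 = "GG")
)

# Assumed reference allele per SNP (illustrative, not real annotation).
ref <- c(rs4477212 = "A", rs3094315 = "A", rs3131972 = "G")

# Hypothetical helper: count copies of the reference allele in one call.
# Returns 3 (missing) for anything that is not a two-letter A/C/G/T call.
count_ref <- function(call, ref_allele) {
  alleles <- strsplit(call, "")[[1]]
  if (length(alleles) != 2 || any(!alleles %in% c("A", "C", "G", "T"))) {
    return(3L)
  }
  sum(alleles == ref_allele)
}

# Fill the SNP-by-sample integer matrix, starting from "all missing".
geno <- matrix(3L, nrow = nrow(calls), ncol = ncol(calls),
               dimnames = dimnames(calls))
for (i in seq_len(nrow(calls))) {
  for (j in seq_len(ncol(calls))) {
    geno[i, j] <- count_ref(calls[i, j], ref[[i]])
  }
}

print(geno)
```

With this coding, a user who is AA at a SNP whose reference allele is A gets a 2 in that cell. A matrix in this SNP-by-sample layout is, as far as I know, what `snpgdsCreateGeno()` in SNPRelate expects when called with `snpfirstdim=TRUE`, which may be simpler than calling `add.gdsn()` for each node by hand.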

My second question is how I should determine the reference alleles. Should I compute them from my population of 500 people, or should I look them up somewhere else? If the latter, where do you suggest?

Thank you so much.



3.9 years ago
Sydney, Australia
Neilfws48k wrote:

First part of the question: to convert to GDS, I would first try converting the 23andMe data to VCF. A few tools claim to do this; the best I've found is here.

Then you can try snpgdsVCF2GDS() in the SNPRelate package to convert VCF to GDS.
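A minimal sketch of that second step, assuming the 23andMe files have already been merged into a single multi-sample VCF. To keep it self-contained, the example writes a tiny toy VCF to a temporary file first; the sample names, REF/ALT alleles and genotypes in it are invented for illustration and are not real annotations.

```r
library(SNPRelate)  # Bioconductor package; depends on gdsfmt

# Toy two-sample VCF written to a temporary file (made-up alleles and calls).
vcf.fn <- tempfile(fileext = ".vcf")
gds.fn <- tempfile(fileext = ".gds")
vcf_line <- function(...) paste(..., sep = "\t")
writeLines(c(
  "##fileformat=VCFv4.1",
  vcf_line("#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
           "FORMAT", "user1", "user2"),
  vcf_line("1", "82154", "rs4477212", "A", "G", ".", "PASS", ".",
           "GT", "0/0", "0/1"),
  vcf_line("1", "752566", "rs3094315", "A", "G", ".", "PASS", ".",
           "GT", "0/1", "1/1"),
  vcf_line("1", "752721", "rs3131972", "G", "A", ".", "PASS", ".",
           "GT", "1/1", "0/1")
), vcf.fn)

# Convert VCF to GDS, keeping only biallelic SNPs.
snpgdsVCF2GDS(vcf.fn, gds.fn, method = "biallelic.only")

# Print a short summary of the new GDS file.
snpgdsSummary(gds.fn)

# Open the file, read back the sample IDs, and close it again.
genofile <- snpgdsOpen(gds.fn)
sample.id <- read.gdsn(index.gdsn(genofile, "sample.id"))
snpgdsClose(genofile)
print(sample.id)
```

Going through VCF also means the reference allele for each SNP is taken from the REF column of the VCF, so it is worth checking how the conversion tool chooses REF before relying on the result.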

