Using SNPRelate on a 23andMe dataset
9.0 years ago
abims ▴ 10

I want to use 23andMe raw data in SNPRelate. My data covers 500 individuals and looks like this:

# rsid    chromosome    position    genotype
rs4477212    1    82154    AA
rs3094315    1    752566    AA
rs3131972    1    752721    GG

But I first need to convert it to GDS format.

I need to specify snp.id, sample.id, snp.position, etc., and also create the genotype node:

add.gdsn(newfile, "sample.id", sample.id)
add.gdsn(newfile, "snp.id", snp.id)
add.gdsn(newfile, "snp.position", snp.position)
add.gdsn(newfile, "snp.allele", c("A/G", "T/C", ...))
var.geno <- add.gdsn(newfile, "genotype",
    valdim=c(length(snp.id), length(sample.id)), storage="bit2")

As I understand it, sample.id is the vector of all user IDs, snp.id is the vector of all SNPs, and so on. In the genotype part, how do I indicate that user x's genotype at SNP y is AA? What kind of matrix is it?
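To make the matrix in question concrete: as far as I know, SNPRelate's genotype node stores, for each SNP and sample, the number of copies of the "A allele" (the first allele listed in snp.allele), i.e. 0, 1 or 2, with 3 meaning missing. Below is a minimal base-R sketch of that encoding. The sample names, the genotype calls, and the choice of A allele per SNP are all made up for illustration, and `encode_snp` is a hypothetical helper, not part of SNPRelate.

```r
# Toy 23andMe-style calls: rows are SNPs, columns are samples.
# "--" is the no-call code used in 23andMe raw data files.
# These calls are invented for illustration only.
calls <- matrix(
  c("AA", "AG", "GG",
    "AA", "--", "AG"),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("rs3094315", "rs3131972"),
                  c("user1", "user2", "user3"))
)

# Assumed "A allele" for each SNP (an arbitrary choice for this sketch).
a_allele <- c(rs3094315 = "A", rs3131972 = "A")

# Hypothetical helper: count copies of allele `a` in each two-letter call.
# Anything that is not exactly two of A/C/G/T (no-calls, single-letter
# calls, indel codes) is coded as 3, i.e. missing.
encode_snp <- function(gt, a) {
  n <- vapply(strsplit(gt, ""), function(x) sum(x == a), integer(1))
  n[!grepl("^[ACGT]{2}$", gt)] <- 3L
  n
}

# Build the SNP-by-sample integer matrix, one SNP (row) at a time.
geno <- matrix(3L, nrow = nrow(calls), ncol = ncol(calls),
               dimnames = dimnames(calls))
for (i in seq_len(nrow(calls))) {
  geno[i, ] <- encode_snp(calls[i, ], a_allele[[rownames(calls)[i]]])
}

print(geno)
```

The result has the same shape as the valdim in the code above, c(length(snp.id), length(sample.id)), so "user x's SNP y is AA" becomes a 2 in row y, column x when A is the A allele for that SNP. A matrix like this could then be written into the genotype node, for example with gdsfmt's write.gdsn().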

My second question is how I should determine the reference alleles. Should I compute them from my population of 500 people, or look them up somewhere else? If the latter, which source do you suggest?

Thank you so much.

reference-alleles 23andme gds snprelate • 2.6k views
9.0 years ago
Neilfws 49k

First part of the question: to convert to GDS, I would first try converting the 23andMe data to VCF. A few tools claim to do this; the best I've found is here.

Then you can try snpgdsVCF2GDS() in the SNPRelate package to convert VCF to GDS.
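A rough sketch of that second step, assuming SNPRelate is installed. It uses the small example VCF that ships with the package as a stand-in; you would substitute the path to your own converted file (with 500 individuals, the per-person VCFs would presumably need to be merged into one multi-sample VCF first).

```r
library(SNPRelate)

# Stand-in input: the example VCF bundled with SNPRelate.
# Replace with the path to your own (merged) VCF.
vcf.fn <- system.file("extdata", "sequence.vcf", package = "SNPRelate")

# Output GDS file; a temporary file is used here for illustration.
gds.fn <- tempfile(fileext = ".gds")

# Convert VCF to GDS, keeping only biallelic SNPs.
snpgdsVCF2GDS(vcf.fn, gds.fn, method = "biallelic.only")

# Open the result and read back the sample and SNP identifiers.
genofile <- snpgdsOpen(gds.fn)
sample.id <- read.gdsn(index.gdsn(genofile, "sample.id"))
snp.id <- read.gdsn(index.gdsn(genofile, "snp.id"))
snpgdsClose(genofile)

length(sample.id)
length(snp.id)
```

With this route the sample.id, snp.id, snp.position, snp.allele and genotype nodes are created by the converter, so there is no need to build them by hand with add.gdsn().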
