Question: Using SNPRelate on a 23andMe dataset
4.6 years ago by
abims10
Turkey
abims10 wrote:

I want to use 23andMe raw data in SNPRelate. My data looks like this, and there are 500 individuals:

# rsid    chromosome    position    genotype
rs4477212    1    82154    AA
rs3094315    1    752566    AA
rs3131972    1    752721    GG

But I first need to convert it to GDS format.

I need to define snp.id, sample.id, snp.position, etc., and also create the genotype node:

add.gdsn(newfile, "sample.id", sample.id)
add.gdsn(newfile, "snp.id", snp.id)
add.gdsn(newfile, "snp.position", snp.position)
add.gdsn(newfile, "snp.allele", c("A/G", "T/C", ...))

 

.....

var.geno <- add.gdsn(newfile, "genotype",
    valdim=c(length(snp.id), length(sample.id)), storage="bit2")

 

What I understand is that sample.id is the vector of all the user IDs, snp.id is the vector of all SNPs, and so on. So, in the genotype part, how would I indicate that user x's genotype at SNP y is AA? What kind of matrix is it?

My second question is how I should determine the reference alleles: should I compute them from my population of 500 people, or should I look them up somewhere else? If the latter, where do you suggest?
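
For reference, the SNPRelate tutorial describes the genotype node as an integer matrix with values 0, 1, 2, and 3: the number of copies of the A allele (the first allele listed in snp.allele, e.g. the "A" in "A/G"), with 3 marking a missing call. With valdim=c(length(snp.id), length(sample.id)), rows are SNPs and columns are samples. A minimal base-R sketch of that encoding follows, assuming bi-allelic SNPs; the sample names, the calls, and the choice of A allele per SNP are made up for illustration, and count_a is a hypothetical helper, not part of SNPRelate.

```r
# Toy 23andMe-style calls: one row per SNP, one column per sample.
# Sample names and genotype calls are invented for illustration.
calls <- matrix(
  c("AA", "AG", "GG",
    "AA", "--", "AG",
    "GG", "GG", "AG"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("rs4477212", "rs3094315", "rs3131972"),
                  c("user1", "user2", "user3"))
)

# Assumed A (first) allele for each SNP, as in snp.allele = "A/G" or "G/A".
a_allele <- c(rs4477212 = "A", rs3094315 = "A", rs3131972 = "G")

# Hypothetical helper: count copies of the A allele in a two-letter call.
# Anything that is not two nucleotide letters (e.g. "--") is coded 3 (missing).
count_a <- function(call, a) {
  if (!grepl("^[ACGT]{2}$", call)) return(3L)
  sum(strsplit(call, "")[[1]] == a)
}

# SNP-by-sample integer matrix, initialised to missing.
genmat <- matrix(3L, nrow(calls), ncol(calls), dimnames = dimnames(calls))
for (i in seq_len(nrow(calls))) {
  for (j in seq_len(ncol(calls))) {
    genmat[i, j] <- count_a(calls[i, j], a_allele[[i]])
  }
}

print(genmat)
```

So "user x has AA at SNP y" becomes genmat[y, x] = 2 when A is the first allele for that SNP, and 0 when A is the second allele. A matrix of this shape is what would be written into the "genotype" node.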

Thank you so much.

 

 

modified 4.6 years ago by Neilfws • written 4.6 years ago by abims
4.6 years ago by
Neilfws48k
Sydney, Australia
Neilfws48k wrote:

First part of the question: to convert to GDS, I would first try converting the 23andMe data to VCF. A few tools claim to do this; the best I've found is here.

Then you can try snpgdsVCF2GDS() in the SNPRelate package to convert VCF to GDS.
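
A minimal sketch of that second step, assuming SNPRelate is installed and that the per-individual VCFs have already been merged into one multi-sample VCF. The function name convert_vcf_to_gds and the file names in the usage comment are hypothetical; snpgdsVCF2GDS() and snpgdsSummary() are SNPRelate functions.

```r
# Hypothetical wrapper around SNPRelate's VCF-to-GDS conversion.
# Assumes vcf_fn is a single multi-sample VCF covering all individuals.
convert_vcf_to_gds <- function(vcf_fn, gds_fn) {
  # Fail early if the input file is missing.
  stopifnot(file.exists(vcf_fn))

  # Keep only bi-allelic SNPs, which is what most SNPRelate analyses expect.
  SNPRelate::snpgdsVCF2GDS(vcf_fn, gds_fn, method = "biallelic.only")

  # Print a short summary (number of samples and SNPs) as a sanity check.
  SNPRelate::snpgdsSummary(gds_fn)

  invisible(gds_fn)
}

# Usage (file names are placeholders):
# convert_vcf_to_gds("all_samples.vcf", "all_samples.gds")
# genofile <- SNPRelate::snpgdsOpen("all_samples.gds")
```

With this route the genotype matrix and the allele labels are taken from the VCF, so the GDS nodes do not have to be built by hand.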

written 4.6 years ago by Neilfws