10.5 years ago
xiaoyanyan97
I have a dataset like this:
1 1 0 0 1 1 A A G T
2 1 0 0 1 1 A C T G
3 1 0 0 1 1 0 0 G G
4 1 0 0 1 2 A C T T
5 1 0 0 1 2 C C G T
6 1 0 0 1 2 C C T T
.ped
1 snp1 0 1
1 snp2 0 2
.map
I use the option --recodeA to convert them to
FID IID PAT MAT SEX PHENOTYPE snp1_A snp2_G
1 1 0 0 1 1 2 1
2 1 0 0 1 1 1 1
3 1 0 0 1 1 NA 2
4 1 0 0 1 2 1 0
5 1 0 0 1 2 0 1
6 1 0 0 1 2 0 0
.raw
There are NA values in my data, but they are not allowed in my analysis. How can I deal with them in plink?
Thank you.
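For readers following along, the .raw values above can be reproduced by counting copies of the allele named in each column header (snp1_A counts A, snp2_G counts G), with any genotype containing the missing code 0 becoming NA. The following is only a minimal Python sketch of that counting rule, not plink's own code; the function name `count_allele` and the hard-coded genotype list are illustrative assumptions.

```python
# Sketch: additive (0/1/2) recoding of PED genotypes, mimicking the .raw output.
MISSING = "0"  # missing-allele code used in the PED file


def count_allele(a1, a2, counted):
    """Return the number of copies of `counted`, or None for a no-call."""
    if a1 == MISSING or a2 == MISSING:
        return None  # shown as NA in the .raw file
    return (a1 == counted) + (a2 == counted)


# Genotype columns of the six samples in the question: (snp1, snp2).
ped_genotypes = [
    (("A", "A"), ("G", "T")),
    (("A", "C"), ("T", "G")),
    (("0", "0"), ("G", "G")),
    (("A", "C"), ("T", "T")),
    (("C", "C"), ("G", "T")),
    (("C", "C"), ("T", "T")),
]
counted_alleles = ("A", "G")  # taken from the header names snp1_A and snp2_G

raw_rows = [
    [count_allele(a1, a2, c) for (a1, a2), c in zip(row, counted_alleles)]
    for row in ped_genotypes
]

for row in raw_rows:
    print(" ".join("NA" if v is None else str(v) for v in row))
```

The third sample is the only one with a 0 0 genotype, which is why it is the only row with NA.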
Please clarify: why do you need to convert it to raw (recodeA) format? Are you going to use plink for the analysis? If yes, then why the conversion?
Because I am fitting a linear regression with a model that does not allow NA (missing genotypes), so I have to convert them to some other value. Someone told me that plink can fix the NA (missing genotypes); I have looked for how to do it but have not succeeded. Because my data come from an experiment, I can't recode NA to an arbitrary value.
Still not clear why you need to convert to raw format. You could just use plink --file mydata --linear with the original PED/MAP files. See: Plink - Linear and logistic models
Sorry, it is a different model, group-lasso, and it does not allow NA. Before I use it, I have to convert my data (which include 50kb snp and are coded with ATCG) to 0,1,2. Because there are 0 0 genotypes in my old data, NA appears in the new data after conversion.
0 0 means no call; when converted to raw, it becomes NA (not available). These samples need to be excluded from the analysis. In R, to exclude samples:
snp1_A <- my.raw[ !is.na(my.raw$snp1_A), "snp1_A"]
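The same exclusion can be applied to whole rows, so that the genotype matrix and the phenotype vector stay aligned. Below is a minimal Python sketch under the assumption that the .raw file has the six leading columns shown in the question; the variable names (`raw_text`, `N_META`, `x`, `y`) are illustrative, and the table is embedded as a string so the example is self-contained.

```python
# Sketch: keep only samples with no NA in any SNP column of a .raw table.
raw_text = """\
FID IID PAT MAT SEX PHENOTYPE snp1_A snp2_G
1 1 0 0 1 1 2 1
2 1 0 0 1 1 1 1
3 1 0 0 1 1 NA 2
4 1 0 0 1 2 1 0
5 1 0 0 1 2 0 1
6 1 0 0 1 2 0 0
"""

N_META = 6  # FID IID PAT MAT SEX PHENOTYPE come before the SNP columns

lines = raw_text.strip().splitlines()
header = lines[0].split()
rows = [line.split() for line in lines[1:]]

# A sample is complete when none of its SNP fields is the string "NA".
complete = [r for r in rows if "NA" not in r[N_META:]]

x = [[int(v) for v in r[N_META:]] for r in complete]      # genotype matrix
y = [int(r[header.index("PHENOTYPE")]) for r in complete]  # matching phenotypes

print(x)
print(y)
```

With many SNPs this can discard a large share of the samples, since a single missing call removes the whole row.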
I have tried it, but my model is a function that is already designed. What I need to do is pass my data as x (a matrix that includes the recoded missing values); with your method, the data will not stay intact.
Is there a method in plink that can fill in the NA based on the other SNPs, so that the error will be lower?
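As a point of comparison for the question above, one simple way to keep the matrix intact is to fill each NA with the mean of the observed values for that SNP. This is a deliberately simpler technique than imputing from neighbouring SNPs, and it is not a plink feature; it is only a Python sketch, and the function name `impute_column_means` is hypothetical.

```python
# Sketch: per-SNP mean imputation of missing genotypes (None = NA).
# This is NOT imputation from other SNPs; each column is filled on its own.


def impute_column_means(matrix):
    """Replace None in each column with the mean of that column's observed values."""
    n_cols = len(matrix[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in matrix if row[j] is not None]
        # Assumes every SNP has at least one observed genotype.
        means.append(sum(observed) / len(observed))
    return [
        [means[j] if v is None else v for j, v in enumerate(row)]
        for row in matrix
    ]


# SNP columns of the .raw table from the question (snp1_A, snp2_G).
genotypes = [
    [2, 1],
    [1, 1],
    [None, 2],
    [1, 0],
    [0, 1],
    [0, 0],
]

x = impute_column_means(genotypes)
for row in x:
    print(row)
```

The filled value is a fractional dosage rather than 0, 1 or 2, so whether it is acceptable depends on the downstream model.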
Open the file with Notepad and replace NA with whatever you like.
Because my data are real and my genotypes are coded 0,1,2, I can't recode NA (missing genotypes) with whatever I like.