Question: Converting SNP Genotype Data Into a 0,1,2 Matrix: Any Software?
grittygeek30 wrote:

Hello,

I have SNP genotype data and would like to convert it into matrices of 0, 1, 2. I found the same question at http://www.biostars.org/post/show/15399/how-to-convert-snp-genotype-data-into-012-matrix/ but could not find an answer there about which package or software to use. Can someone provide a link to a suitable conversion tool?

I am not a bioinformatics person; I am basically a software guy. I would also like to know which software packages (ideally an open-source tool written in Java or Python) are good for analyzing genomic SNP data and have reasonable documentation.

I tried to run the following code, but according to the snpMatrix class documentation (http://svitsrv25.epfl.ch/R-doc/library/snpMatrix/html/snp-class.html) the matrix has to be an snp class object. Can someone tell me how to convert the variable "val" into an snp matrix?

In short, I want to do the inverse of the solution given in this thread (http://www.biostars.org/post/show/14703/updated-r-package-to-analyse-eqtl-and-tutorials-available-for-the-association-of-genetic-variants-and-gene-expression/).

Code:

val <- read.table("C:/Users/vineeth/Desktop/Data.txt")
val
  V1 V2 V3
1 GG AA AG
2 GG AA AA
3 AA AA GG

coerce(from=val, to="numeric", strict=TRUE)
[1] "c(2, 2, 1)" "c(1, 1, 1)" "c(2, 1, 3)"

snp genotyping • 8.0k views
modified 7.0 years ago by 1234Jc4321 • written 7.1 years ago by grittygeek

What is the format of your SNP genotype data? What would the 0,1,2 represent? Would 0 mean that the individual is homozygous for the reference allele, for the ancestral allele, or something else? Do you want to preserve the phase of the genotypes? Note that if you convert your data to matrices of 0,1,2, you lose the phase of the data.

written 7.1 years ago by Giovanni M Dall'Olio
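To make the ambiguity concrete, here is a minimal Python sketch. It assumes biallelic SNPs written as unphased two-letter strings such as "AG"; the helper name `count_allele` is hypothetical and not part of any package mentioned in this thread. The same genotypes yield different 0,1,2 codes depending on which allele is counted, and "AG" and "GA" collapse to the same value, which is the loss of phase mentioned above.

```python
def count_allele(genotype: str, allele: str) -> int:
    """Return how many copies of `allele` a two-letter genotype carries."""
    return sum(1 for base in genotype if base == allele)


genotypes = ["GG", "AG", "GA", "AA"]

# Counting copies of A: 0 means homozygous G/G.
coded_vs_a = [count_allele(g, "A") for g in genotypes]
# Counting copies of G: 0 now means homozygous A/A instead.
coded_vs_g = [count_allele(g, "G") for g in genotypes]

print(coded_vs_a)  # [0, 1, 1, 2]
print(coded_vs_g)  # [2, 1, 1, 0]
```

Which allele to count (reference, ancestral, minor, ...) is a decision about the analysis, not something the code can infer by itself.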

Sorry, to be frank, I don't know much about bioinformatics. My job is just to analyze the data, and my first phase involves running a Random Forest.

written 7.1 years ago by grittygeek

You should at least know what the 0,1,2 in your output are supposed to represent, because there is more than one possible interpretation. Ask your boss: sit down with him and write a test case. Python's doctests are good in these cases.

written 7.1 years ago by Giovanni M Dall'Olio
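As a rough illustration of such a test case, the sketch below pins down one possible convention as doctests: each value is the number of copies of an agreed allele per SNP. The function name `encode_row` and the choice of G as the counted allele are assumptions made only for this example; the real convention has to come from whoever will use the matrix.

```python
import doctest


def encode_row(genotypes, counted_alleles):
    """Encode one individual's genotypes as copies of the agreed allele per SNP.

    >>> encode_row(["GG", "AA", "AG"], ["G", "G", "G"])
    [2, 0, 1]
    >>> encode_row(["AA", "AA", "GG"], ["G", "G", "G"])
    [0, 0, 2]
    """
    # str.count gives the number of copies of the chosen allele (0, 1 or 2).
    return [gt.count(allele) for gt, allele in zip(genotypes, counted_alleles)]


if __name__ == "__main__":
    # Silent when every example in the docstring matches; prints a report otherwise.
    doctest.testmod()
```

If the agreed meaning of 0,1,2 changes, only the expected values in the docstring need updating, and the test will flag any code that no longer matches.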
1234Jc4321410 wrote:

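You can use PLINK: http://pngu.mgh.harvard.edu/~purcell/plink/index.shtml

There is a command for this kind of transformation: plink --file data --recode12

Note that --recode12 recodes the alleles as 1 and 2. If you need per-SNP allele counts of 0, 1, 2, use plink --file data --recodeA, which writes an additive-coded .raw file.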

written 7.1 years ago by 1234Jc4321

Thank you. My data is in .csv format; do you know whether PLINK can read it?

written 7.1 years ago by grittygeek

Then the command for the transformation would look like: plink --ped ped.csv --map map.csv --recode12

written 2.4 years ago by BlackHole
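For a small CSV, a plain Python sketch may also be enough. The layout below (one row per individual, one column per SNP, two-letter unphased genotypes) is an assumption modelled on the table in the question, and counting the minor allele is only one possible convention; `minor_allele` and `to_012_matrix` are hypothetical helper names, not functions from PLINK or snpMatrix.

```python
import csv
import io
from collections import Counter

# Assumed layout: header row of SNP names, then one row per individual.
SAMPLE_CSV = """V1,V2,V3
GG,AA,AG
GG,AA,AA
AA,AA,GG
"""


def minor_allele(column):
    """Return the less frequent allele in a SNP column, or None if monomorphic."""
    counts = Counter("".join(column))
    if len(counts) < 2:
        return None
    # Ties are broken alphabetically so the result is deterministic.
    return min(counts, key=lambda allele: (counts[allele], allele))


def to_012_matrix(rows):
    """Code each genotype as the number of minor-allele copies (0, 1 or 2)."""
    minors = [minor_allele(column) for column in zip(*rows)]
    return [
        [gt.count(m) if m is not None else 0 for gt, m in zip(row, minors)]
        for row in rows
    ]


reader = csv.reader(io.StringIO(SAMPLE_CSV))
header = next(reader)
rows = [row for row in reader if row]
matrix = to_012_matrix(rows)

for row in matrix:
    print(row)
```

To read a real file, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle. Missing genotypes and SNPs with more than two alleles are not handled in this sketch.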