Question: Snptest (Gen/Sample) Files To R
8.0 years ago by
Joey410
Seattle
Joey410 wrote:

Does anyone know of, or can anyone point me to, a resource on how to convert SNPTEST dosage data files (GEN/SAMPLE) so that they work in R/SAS? PLINK can read SNPTEST dosage-format data, but it seems it can only perform association tests, whereas I would like to run a multinomial logistic regression.

Thanks,

-Joey


EDIT: added example data. I was given two sets of data: a) HapMap2 imputed: a series of files for each chromosome in standard SNPTEST format (GEN/SAMPLE) plus a *.mlinfo file, i.e. 66 files in total.

The *.mlinfo files look like the following:

SNP POS A1 A2 REF_FREQ RSQ
rs10047182 4434181 A G 0.117476853526221 0.98222786900009
rs1009345 3576288 A G 0.395093490054250 0.389054499338887

b) 1000 Genomes imputed dataset: IMPUTE v2 was used to generate the files. For each chromosome, I have around 40-50 chunks, depending on the number of SNPs in each.

I have a chunk1_info file which has the following:

snp_id rs_id position exp_freq_a1 info certainty type info_type0 concord_type0 r2_type0
--- rs58108140 10583 0.125 0.025 0.765 0 -1 -1 -1
--- rs3877545 11508 1.000 0.000 1.000 0 -1 -1 -1

An infobysample file:

concord_type0 r2_type0
0.949 0.915
0.949 0.936

and the SNP information contained in each of the chunks:

--- rs4912140 20001071 T G 0 0 1 0 1 0 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1 0 0 1 0 0 1 0.004 0.595 0.401 0 ........

I guess what I want is a file similar to the one you get from the --recodeA option in PLINK. I could then use that *.raw file along with covariates to run a range of other models (multinomial logit or a Cox proportional hazards model).

Thanks,

Joey
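
For readers landing here: the GEN line shown above has five leading columns (SNP ID, rsID, position, allele A, allele B) followed by one triple of genotype probabilities per sample, P(AA), P(AB), P(BB). A minimal Python sketch of turning such a line into per-sample dosages is given below. It rests on a few assumptions that are not stated in the question: that allele B (the second allele listed) is the one being counted, that an all-zero triple means a missing genotype, and that the function name gen_line_to_dosage is purely illustrative. The example line is the first three samples plus the one non-integer triple from the excerpt above.

```python
def gen_line_to_dosage(line):
    """Convert one GEN line into per-sample expected counts of allele B.

    Assumes five leading columns followed by P(AA) P(AB) P(BB) per sample.
    """
    fields = line.split()
    snp_id, rs_id, position, allele_a, allele_b = fields[:5]
    probs = [float(x) for x in fields[5:]]
    if len(probs) % 3 != 0:
        raise ValueError("expected three genotype probabilities per sample")

    dosages = []
    for i in range(0, len(probs), 3):
        p_aa, p_ab, p_bb = probs[i:i + 3]
        if p_aa + p_ab + p_bb == 0:
            # Assumption: an all-zero triple marks a missing genotype.
            dosages.append(None)
        else:
            # Expected number of B alleles: 0*P(AA) + 1*P(AB) + 2*P(BB).
            dosages.append(p_ab + 2.0 * p_bb)
    return rs_id, int(position), allele_a, allele_b, dosages


example = "--- rs4912140 20001071 T G 0 0 1 0 1 0 0.004 0.595 0.401"
rs_id, position, allele_a, allele_b, dosages = gen_line_to_dosage(example)
print(rs_id, allele_b, [round(d, 3) for d in dosages])
```

Which allele to count is a choice; PLINK's additive recoding counts a specific allele per SNP, so the direction may need flipping (2 minus the dosage) to match it.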

gwas plink R • 7.0k views
modified 3.7 years ago by s.w.vanderlaan40 • written 8.0 years ago by Joey410

It could be helpful if you posted a small example of what this file format looks like, and perhaps also what you want to convert it to ("so that they work in R" is not very specific - R is pretty flexible and does not require strictly specified file formats).

written 8.0 years ago by Gaffa490

Just added a link to our GWASToolKit on GitHub: https://github.com/swvanderlaan/GWASToolKit.

written 2.7 years ago by s.w.vanderlaan40

7.5 years ago by
zx87548.2k
London
zx87548.2k wrote:

The GenABEL package has a function, impute2databel(): http://www.genabel.org/packages/GenABEL

For analysis try ProbABEL http://www.genabel.org/manuals/ProbABEL

This post might be helpful http://biostar.stackexchange.com/questions/6499/how-to-analyze-imputed-gwas-data

modified 7.5 years ago • written 7.5 years ago by zx87548.2k

3.7 years ago by
Netherlands
s.w.vanderlaan40 wrote:

Is there still interest in this issue? We wrote a Bash script and a Perl script to convert IMPUTE2 data to PLINK-style dosage data. If needed, I can post a link to the scripts.

By popular demand, here is the link to the beta version of our 'GWASToolKit': https://github.com/swvanderlaan/GWASToolKit. You can use either of the two files, 'convert_impute2dosage.pl' or 'convert_impute2dosage.sh', to convert IMPUTE2 data to dosages.
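
To illustrate the general idea of such a conversion (this is not the GWASToolKit code, just a small independent sketch in Python), the snippet below reads a GEN file together with its SAMPLE file and writes a sample-by-SNP dosage table. Assumptions made here for illustration: the SAMPLE file has two header lines followed by one row per sample with the IDs in the first two columns, the counted allele is the second one listed in the GEN file, all-zero probability triples are written as NA, and the output layout (FID, IID, then one column per SNP) is a simplified stand-in for PLINK's *.raw format. The function names and the toy data are hypothetical.

```python
import io


def read_sample_ids(sample_lines):
    """Return (ID_1, ID_2) pairs, skipping the two SAMPLE header lines."""
    rows = [ln.split() for ln in sample_lines if ln.strip()]
    return [(r[0], r[1]) for r in rows[2:]]


def gen_to_raw_like(gen_lines, sample_lines, out):
    """Write a whitespace-delimited table: one row per sample, one column per SNP."""
    ids = read_sample_ids(sample_lines)
    names, columns = [], []
    for line in gen_lines:
        f = line.split()
        if not f:
            continue
        probs = [float(x) for x in f[5:]]
        if len(probs) != 3 * len(ids):
            raise ValueError(f"{f[1]}: probability count does not match SAMPLE file")
        column = []
        for i in range(0, len(probs), 3):
            p_aa, p_ab, p_bb = probs[i:i + 3]
            if p_aa + p_ab + p_bb == 0:
                column.append("NA")  # assumption: all-zero triple = missing
            else:
                column.append(f"{p_ab + 2.0 * p_bb:.3f}")
        names.append(f"{f[1]}_{f[4]}")  # e.g. rs123_G: dosage of allele G
        columns.append(column)

    out.write(" ".join(["FID", "IID"] + names) + "\n")
    for j, (fid, iid) in enumerate(ids):
        out.write(" ".join([fid, iid] + [col[j] for col in columns]) + "\n")


# Toy input standing in for real files (hypothetical IDs and SNPs).
sample_text = "ID_1 ID_2 missing\n0 0 0\nfam1 ind1 0\nfam2 ind2 0\n"
gen_text = "--- rs1 100 A G 0 0 1 0.1 0.8 0.1\n--- rs2 200 C T 1 0 0 0 0 0\n"

out = io.StringIO()
gen_to_raw_like(io.StringIO(gen_text), io.StringIO(sample_text), out)
print(out.getvalue())
```

With real data, open file handles would be passed in place of the StringIO objects. Note that this sketch holds every SNP column in memory before writing, so for genome-wide chunks it would need to be run per chunk or rewritten to stream. The resulting table has a header row and should be readable in R with read.table(..., header = TRUE).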

modified 2.7 years ago • written 3.7 years ago by s.w.vanderlaan40

Please do. If someone's future Google adventure leads here, they might order a hit on you if this is the last post.

written 3.7 years ago by Zaag720

I would be very interested in seeing your bash and perl scripts for this data conversion.

I had been looking into using GTOOL for it, but a simple Bash script would be preferable in my eyes.

written 3.0 years ago by jlwebb20

7.8 years ago by
Bergen, Norway
Michael Dondrup46k wrote:

It seems you can use read.table or scan, as with any tabular text format. See ?read.table and ?scan. Use scan if read.table takes too long.

written 7.8 years ago by Michael Dondrup46k