Question: Transforming Plink Files
FFK534 (United States) wrote, 4.2 years ago:

Does anyone have suggestions for combining the .ped and .map files from PLINK and transforming them into a different format? For example, the .ped files look like this:

#FID IID PAT MAT SEX STATUS G1 G2 G3 G4 G5 G6 G7 G8 G9 G10
12322 12322A 0 0 1 1 1 1 1 1 1 1 1 1 2 1
12322 12322B 0 0 2 0 1 1 1 1 2 2 2 1 1 2
12322 12322C 0 0 2 1 2 1 1 1 1 1 1 1 1 1

 

and the maps look like this: 

#CHR G GD BP
1 1_135195_A/G 0 135195
1 1_135203_G/A 0 135203
1 1_136596_GGGG/- 0 136596
1 1_136604_G/C 0 136604
1 1_136619_G/A 0 136619
1 1_136620_C/T 0 136620
1 1_136635_T/G 0 136635
1 1_136645_G/- 0 136645
1 1_136652_A/G 0 136652
1 1_136779_G/A 0 136779

 

And what I'd like is this:  

1_135195_A/G 1_135203_G/A 1_136596_GGGG/- 1_136604_G/C 1_136619_G/A 1_136620_C/T 1_136635_T/G 1_136645_G/- 1_136652_A/G 1_136779_G/A STATUS
1 1 1 1 1 1 1 1 2 1 1
1 1 1 1 2 2 2 1 1 2 0
2 1 1 1 1 1 1 1 1 1 1

 

That is, the 2nd column of the .map file becomes the header of the output file, and the 6th column of the .ped file (STATUS) becomes its final column.

Tags: transform, plink, R, formatting
RamRS (Houston, TX) wrote, 4.2 years ago:

If you're willing to use R, read both files into data frames, extract the columns you need as vectors, and restructure them as you see fit! Also, please only use relevant tags - I don't see how SQL is relevant here.


FFK534 replied, 4.2 years ago:

I should note there are thousands of columns in the first file and thousands of rows in the second file. How can I extract multiple vectors without listing the columns/rows individually?

RamRS replied, 4.2 years ago:

You can transpose and slice. Speaking of which, you could try Python: it might be a bit more flexible, but you'll have to spend more time on the logic.
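
One way to take whole blocks of columns without naming each one is to index by position. The following is a minimal sketch in R, assuming the layout shown in the question (six leading columns, then one genotype column per marker). The inline sample text is copied from the question, and the name `out` is only for illustration.

```r
# Sample rows copied from the question; real data would be read from the files instead.
ped <- read.table(text = "12322 12322A 0 0 1 1 1 1 1 1 1 1 1 1 2 1
12322 12322B 0 0 2 0 1 1 1 1 2 2 2 1 1 2
12322 12322C 0 0 2 1 2 1 1 1 1 1 1 1 1 1")
map <- read.table(text = "1 1_135195_A/G 0 135195
1 1_135203_G/A 0 135203
1 1_136596_GGGG/- 0 136596
1 1_136604_G/C 0 136604
1 1_136619_G/A 0 136619
1 1_136620_C/T 0 136620
1 1_136635_T/G 0 136635
1 1_136645_G/- 0 136645
1 1_136652_A/G 0 136652
1 1_136779_G/A 0 136779", stringsAsFactors = FALSE)

# Columns 7 to the last hold the genotypes, however many there are.
out <- ped[, 7:ncol(ped)]
# The 2nd column of the map holds the marker names, in the same order.
colnames(out) <- map[[2]]
# STATUS is the 6th column of the ped data; append it as the last column.
out$STATUS <- ped[[6]]
```

The same three indexing steps apply unchanged to thousands of columns, since nothing refers to a column by name.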
FFK534 (United States) wrote, 4.2 years ago:

This works: 

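# Read both files (read.table skips lines starting with "#" by default)
ped <- read.table("/pathtodata.ped")
map <- read.table("/pathtodata.map")

# Keep only the marker names (2nd column of the .map file)
map$V1 <- map$V3 <- map$V4 <- NULL
tmap <- t(map)
# Drop FID, IID, PAT, MAT and SEX from the .ped data
ped$V1 <- ped$V2 <- ped$V3 <- ped$V4 <- ped$V5 <- NULL
# Set STATUS (6th column) aside, then remove it
status <- ped$V6
ped$V6 <- NULL
# Name the genotype columns after the markers, then append STATUS last
colnames(ped) <- c(tmap)
ped$Status <- status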
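
The renaming step above assumes that the number of genotype columns equals the number of rows in the .map file. That holds for the sample in the question (10 and 10), but a standard PLINK .ped file stores two allele columns per marker, so it may be worth checking before assigning names. Below is a hedged sketch of such a check plus writing the result to disk. The helper `label_genotypes` and the two-marker input are hypothetical, made up only to exercise the idea.

```r
# Hypothetical helper: name genotype columns after markers, stopping on a count mismatch.
label_genotypes <- function(geno, markers) {
  if (ncol(geno) != length(markers)) {
    stop("genotype columns (", ncol(geno), ") and markers (",
         length(markers), ") differ")
  }
  colnames(geno) <- markers
  geno
}

# Made-up two-marker example, only to exercise the helper.
geno <- data.frame(V7 = c(1, 2), V8 = c(1, 1))
labelled <- label_genotypes(geno, c("1_135195_A/G", "1_135203_G/A"))
labelled$Status <- c(1, 0)

# Write the result space-separated, without row names or quotes.
out_file <- tempfile(fileext = ".txt")
write.table(labelled, out_file, quote = FALSE, row.names = FALSE)

# A mismatch is reported as an error that includes both counts.
mismatch <- tryCatch(label_genotypes(geno, "1_135195_A/G"),
                     error = function(e) conditionMessage(e))
```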

 

zx8754 (London) wrote, 4.2 years ago:

Look into the --recodeA option in PLINK. It writes the data to a single *.raw file. http://pngu.mgh.harvard.edu/~purcell/plink/dataman.shtml#recode
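
If the recode route is taken, the .raw file can be rearranged in R in much the same way. This sketch assumes the .raw layout is a header row with FID, IID, PAT, MAT, SEX and PHENOTYPE followed by one column per marker; the two-marker sample text and its column names are made up for illustration, and real marker column names depend on the input. `check.names = FALSE` is used so that R does not rewrite names containing characters such as "/".

```r
# Made-up sample in the assumed .raw layout; a real file would be read by path instead.
raw <- read.table(text = "FID IID PAT MAT SEX PHENOTYPE snpA_G snpB_T
12322 12322A 0 0 1 1 0 1
12322 12322B 0 0 2 0 2 NA", header = TRUE, check.names = FALSE)

# Keep every marker column (7 onward), then append the phenotype as the last column.
out <- raw[, 7:ncol(raw)]
out$STATUS <- raw$PHENOTYPE
```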
