Troubles with handling vector data (csv format) in R
2
1
Entering edit mode
2.4 years ago
dadaolu ▴ 10

I am trying to convert vector data in csv format from this form 1 to this form 2 and vice versa in R. This has been giving me sleepless nights. I have been doing this in wolfram in the past but I had to switch to R. I would be very grateful for your kind suggestions. Thanks

csv data R frame • 1.4k views
ADD COMMENT
1
Entering edit mode

what is represented in columns B and C of the desired output?

ADD REPLY
0
Entering edit mode

rpolicastro, they are actually cell identifiers in MS Excel. In reality, the first column represents the field of view in x-direction, second column is field of view in y-direction while the third column represents the intensity distribution of an image.

ADD REPLY
2
Entering edit mode

Looks like wide to long form. Check melt() from the reshape2 package or pivot_longer from tidyr.

ADD REPLY
3
Entering edit mode
2.4 years ago

Example data.

df <- as.data.frame(cbind(seq_len(10), replicate(5, rnorm(10, 10, 2))))
colnames(df) <- seq_len(ncol(df))

> df
    1         2         3         4         5         6
1   1 12.702893  8.850859  7.680388  7.490946  9.457605
2   2 10.770156 10.632835  8.534555  8.273824  8.423220
3   3  6.399747 11.745845  7.283831 14.609756 12.841378
4   4  9.832645  7.585846  8.680142  5.917577 10.592092
5   5 14.073382 11.741811  9.235911  9.279248 14.884535
6   6  8.802907 10.402209  8.652651 11.458196  7.249231
7   7 12.076647 12.979112 10.461552 13.131768 13.275357
8   8 11.231053  9.862153 11.059780  8.599320 12.705250
9   9  6.318269 13.033082 10.937295  9.703796  9.578383
10 10 13.427937  8.932672 10.892284 10.935992 13.070986

Tidyverse answer.

library("tidyr")

df_long <- pivot_longer(df, !1)

> df_long
# A tibble: 50 × 3
     `1` name  value
   <dbl> <chr> <dbl>
 1     1 2     12.7 
 2     1 3      8.85
 3     1 4      7.68
 4     1 5      7.49
 5     1 6      9.46
 6     2 2     10.8 
 7     2 3     10.6 
 8     2 4      8.53
 9     2 5      8.27
10     2 6      8.42
# … with 40 more rows

data.table answer (faster than tidyverse with a lot of data)

library("data.table")

setDT(df)
df_long <- melt(df, id.vars=1)

> df_long[1:10, ]
     1 variable     value
 1:  1        2 12.702893
 2:  2        2 10.770156
 3:  3        2  6.399747
 4:  4        2  9.832645
 5:  5        2 14.073382
 6:  6        2  8.802907
 7:  7        2 12.076647
 8:  8        2 11.231053
 9:  9        2  6.318269
10: 10        2 13.427937
ADD COMMENT
2
Entering edit mode
2.4 years ago

For the sake of completeness, I am adding this reply to add another method to what rpolicastro posted and considering ATpoint 's comment :

This is wide to long-form conversion and vice versa scenario. My understanding from the post is the first two columns are x-y coordinates (id variables) and the rest are image intensities at the corresponding coordinates (measure variables). You want to convert from wide to long-form and long-form to wide:

df <- as.data.frame(cbind(seq_len(10), rep(0, 10), replicate(5, rnorm(10, 10, 2))))
colnames(df) <- c("A", "B", "C", "D", "E", "F", "G")

library(reshape2)

# convert df from wide  to long form by melt() function
meltDF = melt(df, id.vars = c("A", "B"))

head(meltDF)
#   A B variable     value
# 1 1 0        C  7.499697
# 2 2 0        C 10.163781
# 3 3 0        C  8.862005
# 4 4 0        C  9.305449
# 5 5 0        C 10.820081
# 6 6 0        C 10.524560

# convert meltDF from long form back to wide form using dcast() function
castDF = dcast(meltDF, A+B ~ variable, value.var = "value")

head(castDF)
#   A B         C         D         E         F         G
# 1 1 0  7.499697  6.492901 10.196289  8.304000 10.835184
# 2 2 0 10.163781 10.269959 11.465557  7.326901  8.174221
# 3 3 0  8.862005 15.052346 12.191936 11.098902  7.913761
# 4 4 0  9.305449  8.986819 12.696276  9.643824  9.675121
# 5 5 0 10.820081  8.908355  8.291135 10.148121  7.451699
# 6 6 0 10.524560  8.256329 10.242652 10.824888  9.351173
ADD COMMENT
1
Entering edit mode

From what I gathered from OP It would be meltDF <- melt(df, id.vars ="A") and castDF <- dcast(meltDF, A ~ variable, value.var = "value"), since the column names represent the y coordinates.

As a side note reshape2 has retired by the author and folded into tidyr [source].

ADD REPLY

Login before adding your answer.

Traffic: 2764 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6