Question

Convert rows to columns in R

0

4.6 years ago

APJ ▴ 40

Hi,

I have a tibble that looks like this:

 head(TPM_a0)
# A tibble: 6 x 3
  depmap_id  gene_name expression
  <chr>      <chr>          <dbl>
1 ACH-000956 TSPAN6          2.65
2 ACH-000429 TSPAN6          3.85
3 ACH-000857 TSPAN6          5.63
4 ACH-000783 TSPAN6          2.25
5 ACH-000963 TSPAN6          5.11
6 ACH-000812 TSPAN6          4.81

I would like to convert it to a data frame in which each row is a gene_name and each column is a depmap_id. I tried the spread() function from tidyr:

TPM_a2 <- TPM_a0 %>% spread(depmap_id, expression)

But I ended up with the following error. Any ideas?

Error: Each row of output must be identified by a unique combination of keys.
Keys are shared for 64932 rows:
* 917895, 1262407
* 509207, 566047
* 1202487, 1208311, 1230683
* 1044847, 1050811, 1052435, 1052519, 1208703, 1211419
* 202075, 869539
* 293075, 1460703
* 264907, 1588831
* 503411, 569127
* 1568195, 1618959

R • 4.5k views

ADD COMMENT • link updated 4.6 years ago by Jean-Karim Heriche 27k • written 4.6 years ago by APJ ▴ 40

1

The error indicates that the same depmap_id appears more than once for the same gene. For example, you probably have something like this:

ACH-000840 TP53  4.75
ACH-000840 TP53  3.23

So when you try to spread your data frame, it does not know which value to put in the cell for gene TP53 and depmap_id ACH-000840. Each combination of gene_name and depmap_id needs to be unique. Check the rows at the row numbers listed in your error message to find out which combinations are duplicated.
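To make the check above concrete, here is a minimal sketch using dplyr and tidyr. The toy tibble `TPM_toy` and its values are made up for illustration, and averaging the duplicates is only one possible way to resolve them (an assumption, not something the original poster asked for); keeping the first value or the maximum may suit your data better. `pivot_wider()` is the newer tidyr replacement for `spread()`.

```r
library(dplyr)
library(tidyr)

# Toy data standing in for TPM_a0 (hypothetical values)
TPM_toy <- tibble(
  depmap_id  = c("ACH-000840", "ACH-000840", "ACH-000956", "ACH-000956"),
  gene_name  = c("TP53", "TP53", "TP53", "TSPAN6"),
  expression = c(4.75, 3.23, 2.10, 2.65)
)

# List the (gene_name, depmap_id) combinations that occur more than once
dups <- TPM_toy %>%
  count(gene_name, depmap_id) %>%
  filter(n > 1)

# Assumption: duplicates are resolved by taking their mean while reshaping
wide <- TPM_toy %>%
  pivot_wider(names_from = depmap_id,
              values_from = expression,
              values_fn = mean)
```

Here `dups` lists the offending combinations, and `wide` has one row per gene and one column per depmap_id, with `NA` where a gene has no value for a given depmap_id.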

ADD REPLY • link 4.6 years ago by patelk26 ▴ 290

score 0 · Answer 1 · 2019-09-30

0

4.6 years ago

Jean-Karim Heriche 27k

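A base R solution would be something like the following (the tibble is converted to a plain data frame first, since reshape() may not handle tibbles correctly):

reshape(as.data.frame(TPM_a0), idvar = "gene_name", timevar = "depmap_id", direction = "wide")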
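As a rough sketch of how this could look end to end in base R, the example below first collapses duplicated gene/depmap_id pairs and then reshapes. The toy data frame `TPM_toy` and its values are hypothetical, and taking the mean of duplicates is an assumption about what you want; swap in another function if it is not.

```r
# Toy data (hypothetical values) as a plain data.frame
TPM_toy <- data.frame(
  depmap_id  = c("ACH-000840", "ACH-000840", "ACH-000956", "ACH-000956"),
  gene_name  = c("TP53", "TP53", "TP53", "TSPAN6"),
  expression = c(4.75, 3.23, 2.10, 2.65),
  stringsAsFactors = FALSE
)

# Assumption: duplicated (gene_name, depmap_id) pairs are collapsed to their mean
TPM_unique <- aggregate(expression ~ gene_name + depmap_id,
                        data = TPM_toy, FUN = mean)

# Long to wide: one row per gene, one "expression.<depmap_id>" column per depmap_id
TPM_wide <- reshape(TPM_unique,
                    idvar = "gene_name",
                    timevar = "depmap_id",
                    direction = "wide")
```

The resulting columns are named `expression.<depmap_id>`; genes with no value for a given depmap_id get `NA` in that column.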

ADD COMMENT • link 4.6 years ago by Jean-Karim Heriche 27k