                    6.1 years ago
        APJ
        
    
    Hi,
I have a tibble that looks like this:
 head(TPM_a0)
# A tibble: 6 x 3
  depmap_id  gene_name expression
  <chr>      <chr>          <dbl>
1 ACH-000956 TSPAN6          2.65
2 ACH-000429 TSPAN6          3.85
3 ACH-000857 TSPAN6          5.63
4 ACH-000783 TSPAN6          2.25
5 ACH-000963 TSPAN6          5.11
6 ACH-000812 TSPAN6          4.81
I would like to convert it to a data frame where each row represents a gene_name and each column is a depmap_id.
I tried the spread() function in R:
TPM_a2 <- TPM_a0 %>% spread(depmap_id, expression)
But I ended up with the following error. Any ideas?
Error: Each row of output must be identified by a unique combination of keys.
Keys are shared for 64932 rows:
* 917895, 1262407
* 509207, 566047
* 1202487, 1208311, 1230683
* 1044847, 1050811, 1052435, 1052519, 1208703, 1211419
* 202075, 869539
* 293075, 1460703
* 264907, 1588831
* 503411, 569127
* 1568195, 1618959
The error indicates that you have duplicate depmap_ids for the same gene, i.e. at least one combination of gene_name and depmap_id appears in more than one row (for example, several rows for gene TP53 with depmap_id ACH-000840).
So when you try to spread your data frame, spread() does not know which of those values to put in the cell for gene TP53 and depmap_id ACH-000840. Each combination of keys (gene_name and depmap_id) needs to be unique. Check the rows at the row numbers listed in your error message to find out which combinations are duplicated.
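To illustrate the check described above, here is a minimal sketch in R using dplyr and tidyr. The toy tibble, its gene names, IDs and values are all made up for illustration and are not taken from the original data. Collapsing duplicates with mean() is only one possible choice and is an assumption here; whether averaging, taking the maximum, or dropping rows is appropriate depends on why the duplicates exist in the first place.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: GENE_B / ACH-000001 occurs twice on purpose
toy <- tibble(
  depmap_id  = c("ACH-000001", "ACH-000002", "ACH-000001", "ACH-000001"),
  gene_name  = c("GENE_A",     "GENE_A",     "GENE_B",     "GENE_B"),
  expression = c(2.0, 3.0, 4.0, 6.0)
)

# Step 1: list every gene_name / depmap_id combination that occurs more than once
dups <- toy %>%
  count(gene_name, depmap_id) %>%
  filter(n > 1)

# Step 2: collapse duplicates to a single value per combination
# (mean is an assumption, pick whatever suits your data), then reshape
wide <- toy %>%
  group_by(gene_name, depmap_id) %>%
  summarise(expression = mean(expression), .groups = "drop") %>%
  spread(depmap_id, expression)
```

On your own data, replacing toy with TPM_a0 in step 1 should list the offending gene/ID pairs. Combinations that are absent from the input (GENE_B with ACH-000002 in the toy data) end up as NA in the wide table.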