Transforming big file
3.4 years ago
mel22 ▴ 100

Hello, I have a very large text file with 500 000 columns and 220 rows. Each cell contains 4 space-separated values, like this:

Sample_name       "VAR1"                                "VAR2"               
Sample1           "-0.0570 0.0113 1.035 0.061"          " -0.3631 0.0065 0.842 0.045" 
Sample2           "-0.0334 1.0000 0.013 0.813"          "-0.0604 0.9639 0.052 0.764"
  

Now I want to transform it like this:

Var_name   Sample_name   V1        V2       V3      V4
VAR1       Sample1       -0.0570   0.0113   1.035   0.061
VAR2       Sample1       -0.3631   0.0065   0.842   0.045
VAR1       Sample2       -0.0334   1.0000   0.013   0.813

Any idea how I can do this, please?

3.4 years ago

If your data.frame is named df and looks like this:

  Sample_name                       VAR1                       VAR2
1     Sample1 -0.0570 0.0113 1.035 0.061 -0.3631 0.0065 0.842 0.045
2     Sample2 -0.0334 1.0000 0.013 0.813 -0.0604 0.9639 0.052 0.764
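For getting from the text file to a data.frame of that shape, here is a minimal sketch using base R's read.table(). It assumes the file is whitespace-delimited with the multi-value cells wrapped in double quotes, as in the question. The inline text stands in for the real file, and the file name in the comment is hypothetical.

```r
# Inline stand-in for the real file, laid out as in the question
txt <- 'Sample_name "VAR1" "VAR2"
Sample1 "-0.0570 0.0113 1.035 0.061" "-0.3631 0.0065 0.842 0.045"
Sample2 "-0.0334 1.0000 0.013 0.813" "-0.0604 0.9639 0.052 0.764"'

# Quoted fields are kept intact, so each cell stays one string of four values.
# For a real file (hypothetical name):
#   df <- read.table("big_file.txt", header = TRUE, stringsAsFactors = FALSE)
df <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

print(df)
```

With 500 000 columns, read.table() is likely to be slow, so this is only meant to show the expected shape of df.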

And then using tidyr:

#install.packages("tidyr")
library(tidyr)
df <- pivot_longer(df, cols = !Sample_name)
df <- separate(df, col = value, sep = " ", into = c("V1","V2","V3","V4"))

The resulting data.frame will look like:

# A tibble: 4 x 6
  Sample_name name  V1      V2     V3    V4   
  <chr>       <chr> <chr>   <chr>  <chr> <chr>
1 Sample1     VAR1  -0.0570 0.0113 1.035 0.061
2 Sample1     VAR2  -0.3631 0.0065 0.842 0.045
3 Sample2     VAR1  -0.0334 1.0000 0.013 0.813
4 Sample2     VAR2  -0.0604 0.9639 0.052 0.764
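
Two things to watch for: V1 to V4 come out as character columns, and a cell with a leading space (like the VAR2 value for Sample1 in the question) would split into an empty first piece when sep = " ". Here is a minimal base R sketch of one way to handle both, assuming every cell holds exactly four values. The data.frame named long is a hypothetical stand-in for the output of pivot_longer().

```r
# Hypothetical stand-in for the pivoted data; note the leading space in row 2
long <- data.frame(
  Sample_name = c("Sample1", "Sample1"),
  name = c("VAR1", "VAR2"),
  value = c("-0.0570 0.0113 1.035 0.061", " -0.3631 0.0065 0.842 0.045"),
  stringsAsFactors = FALSE
)

# Trim surrounding whitespace, then split on runs of whitespace
parts <- strsplit(trimws(long$value), "[[:space:]]+")

# Stack the pieces into a numeric matrix, one row per input row
# (assumes exactly four values per cell)
vals <- matrix(as.numeric(unlist(parts)), ncol = 4, byrow = TRUE,
               dimnames = list(NULL, c("V1", "V2", "V3", "V4")))

result <- cbind(long[c("Sample_name", "name")], vals)
str(result)
```

If you stay within tidyr, separate() also takes a convert = TRUE argument, which should cover the type conversion step.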

It's a transpose. The function is called t() and doesn't need tidyr or tibbles or pivots.


No, that is not the right solution. Please re-read the OP's question. Besides the first column, their data.frame has columns whose values they would like split on spaces into four separate columns. But let's pretend that wasn't the case:

df <- data.frame(
  Sample_name = c("Sample1", "Sample2"),
  VAR1 = c("-0.0570 0.0113 1.035 0.061", "-0.0334 1.0000 0.013 0.813"),
  VAR2 = c("-0.3631 0.0065 0.842 0.045", "-0.0604 0.9639 0.052 0.764")
)

df <- t(df)

           [,1]                         [,2]                        
Sample_name "Sample1"                    "Sample2"                   
VAR1        "-0.0570 0.0113 1.035 0.061" "-0.0334 1.0000 0.013 0.813"
VAR2        "-0.3631 0.0065 0.842 0.045" "-0.0604 0.9639 0.052 0.764"

We are left with the sample names as a row, and each cell still holds four unseparated values. With the other manipulations we would need, it would amount to the same steps as I posted.
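
For comparison, here is a rough base R sketch of what those manipulations could look like without tidyr, starting from the original (untransposed) df. It assumes exactly four values per cell. The names var_cols, blocks and long are made up for illustration.

```r
df <- data.frame(
  Sample_name = c("Sample1", "Sample2"),
  VAR1 = c("-0.0570 0.0113 1.035 0.061", "-0.0334 1.0000 0.013 0.813"),
  VAR2 = c("-0.3631 0.0065 0.842 0.045", "-0.0604 0.9639 0.052 0.764"),
  stringsAsFactors = FALSE
)

var_cols <- setdiff(names(df), "Sample_name")

# Build one block of rows per variable column
blocks <- lapply(var_cols, function(v) {
  parts <- strsplit(trimws(df[[v]]), "[[:space:]]+")
  vals <- matrix(as.numeric(unlist(parts)), ncol = 4, byrow = TRUE,
                 dimnames = list(NULL, c("V1", "V2", "V3", "V4")))
  data.frame(Var_name = v, Sample_name = df$Sample_name, vals,
             stringsAsFactors = FALSE)
})

# Stack the blocks, then order by sample as in the requested layout
long <- do.call(rbind, blocks)
long <- long[order(match(long$Sample_name, df$Sample_name)), ]
rownames(long) <- NULL
print(long)
```

So it is still a reshape followed by a split, just written out by hand.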


Yes, exactly: the problem is not only to transpose but also to separate the columns. Thank you very much, I will test this solution.
