Question: Transforming big file
7 weeks ago, melania 2282 wrote:

Hello, I have a very large text file with 500,000 columns and 220 rows. Each cell holds 4 different values, like this:

Sample_name       "VAR1"                                "VAR2"               
Sample1           "-0.0570 0.0113 1.035 0.061"          " -0.3631 0.0065 0.842 0.045" 
Sample2           "-0.0334 1.0000 0.013 0.813"          "-0.0604 0.9639 0.052 0.764"
  

Now I want to transform it like this:

Var_name   Sample_name   V1        V2       V3      V4
VAR1       Sample1       -0.0570   0.0113   1.035   0.061
VAR2       Sample1       -0.3631   0.0065   0.842   0.045
VAR1       Sample2       -0.0334   1.0000   0.013   0.813

Any idea how I can do this, please?

7 weeks ago, bioinformatics2020 wrote:

If your data.frame is named df and looks like this:

  Sample_name                       VAR1                       VAR2
1     Sample1 -0.0570 0.0113 1.035 0.061 -0.3631 0.0065 0.842 0.045
2     Sample2 -0.0334 1.0000 0.013 0.813 -0.0604 0.9639 0.052 0.764

Then, using tidyr:

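# install.packages("tidyr")  # run once if tidyr is not installed
library(tidyr)
# Stack every column except Sample_name into name/value pairs
df <- pivot_longer(df, cols = !Sample_name)
# Split each space-separated string into four columns
df <- separate(df, col = value, sep = " ", into = c("V1","V2","V3","V4"))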

The resulting data.frame will look like:

# A tibble: 4 x 6
  Sample_name name  V1      V2     V3    V4   
  <chr>       <chr> <chr>   <chr>  <chr> <chr>
1 Sample1     VAR1  -0.0570 0.0113 1.035 0.061
2 Sample1     VAR2  -0.3631 0.0065 0.842 0.045
3 Sample2     VAR1  -0.0334 1.0000 0.013 0.813
4 Sample2     VAR2  -0.0604 0.9639 0.052 0.764
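
Note that V1 to V4 come out as character columns above. As a rough sketch (not part of the original answer), the same two steps can also return numeric columns and tolerate the stray leading space that appears in one cell of the question's sample. The names `long` and `Var_name` are illustrative choices, and the toy data stands in for the real file, which at 500,000 columns by 220 rows would produce on the order of 110 million long rows, so memory may become the limiting factor.

```r
library(tidyr)

# Toy data mirroring the question, including the leading space in one VAR2 cell.
df <- data.frame(
  Sample_name = c("Sample1", "Sample2"),
  VAR1 = c("-0.0570 0.0113 1.035 0.061", "-0.0334 1.0000 0.013 0.813"),
  VAR2 = c(" -0.3631 0.0065 0.842 0.045", "-0.0604 0.9639 0.052 0.764")
)

# Wide to long: one row per (sample, variable) pair.
long <- pivot_longer(df, cols = !Sample_name,
                     names_to = "Var_name", values_to = "value")

# Drop leading/trailing whitespace so the split does not yield an empty first piece.
long$value <- trimws(long$value)

# Split on one or more spaces; convert = TRUE turns the pieces into numbers.
long <- separate(long, col = value, into = c("V1", "V2", "V3", "V4"),
                 sep = " +", convert = TRUE)
```

Here `long` has the columns Sample_name, Var_name, V1, V2, V3 and V4, with the four value columns stored as numbers rather than strings.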

It's a transpose. The function is called t() and doesn't need tidyr or tibbles or pivots.

written 7 weeks ago by karl.stamm

No, that is not the right solution. Please re-read the OP's question. Besides the first column, their data.frame has columns whose values they would like split on spaces into four separate columns. But let's pretend that wasn't the case:

df <- data.frame(
  Sample_name = c("Sample1", "Sample2"),
  VAR1 = c("-0.0570 0.0113 1.035 0.061", "-0.0334 1.0000 0.013 0.813"),
  VAR2 = c("-0.3631 0.0065 0.842 0.045", "-0.0604 0.9639 0.052 0.764")
)

df <- t(df)

           [,1]                         [,2]                        
Sample_name "Sample1"                    "Sample2"                   
VAR1        "-0.0570 0.0113 1.035 0.061" "-0.0334 1.0000 0.013 0.813"
VAR2        "-0.3631 0.0065 0.842 0.045" "-0.0604 0.9639 0.052 0.764"

We are left with the sample names as a row. Once we add the other manipulations we would still need, it comes to about the same number of steps as I posted.
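
To make that comparison concrete, here is one possible base R version of the whole transformation, offered only as a sketch. It still uses t(), but it also has to rebuild the name columns and split the strings by hand. The variable names (`vars`, `long`, `parts`, `result`) are illustrative and not from the thread.

```r
df <- data.frame(
  Sample_name = c("Sample1", "Sample2"),
  VAR1 = c("-0.0570 0.0113 1.035 0.061", "-0.0334 1.0000 0.013 0.813"),
  VAR2 = c("-0.3631 0.0065 0.842 0.045", "-0.0604 0.9639 0.052 0.764")
)

# Every column except the sample identifier is a variable to unstack.
vars <- setdiff(names(df), "Sample_name")

# Transpose the value block, then flatten it column by column, so the
# order is: all variables of Sample1, then all variables of Sample2, ...
long <- data.frame(
  Var_name    = rep(vars, times = nrow(df)),
  Sample_name = rep(df$Sample_name, each = length(vars)),
  value       = as.vector(t(as.matrix(df[vars])))
)

# Split each string on runs of spaces and bind the pieces into a matrix.
parts <- do.call(rbind, strsplit(trimws(long$value), " +"))
storage.mode(parts) <- "double"          # character matrix -> numeric matrix
colnames(parts) <- paste0("V", 1:4)

result <- cbind(long[c("Var_name", "Sample_name")], parts)
```

This needs no extra packages, but it is roughly as many steps as the tidyr version, and it assumes every cell contains exactly four space-separated numbers.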

modified 7 weeks ago • written 7 weeks ago by bioinformatics2020

Yes, exactly: the problem is not only to transpose but also to separate the columns. Thank you very much, I will test this solution.

written 7 weeks ago by melania 2282