Question: Bash script to split columns
7 weeks ago, melania 2282 wrote:

Hello, I have a data file in which each column holds 4 space-separated values in a single cell. How can I split each column into 4 labelled columns (v1, v2, v3 and v4) with bash, R or Python?

Example of my file:

Barcode        "FD1133"                              "FD1138"                
102            "-0.0570 0.0113 1.035 0.061"          " -0.3631 0.0065 0.842 0.045"
104            "-0.0334 1.0000 0.013 0.813"          "-0.0604 0.9639 0.052 0.764"

Thank you very much

bash script split • 232 views
modified 7 weeks ago by Mensur Dlakic • written 7 weeks ago by melania 2282

melania 2282: Please don't delete posts once they have received an answer or comments.

written 7 weeks ago by GenoMax

If you can handle the headers separately, try column -t.

written 7 weeks ago by cpad0112

How does this help OP? column only formats content for display; it does not split or manipulate content the way awk can.

written 7 weeks ago by _r_am
7 weeks ago, Mensur Dlakic (USA) wrote:

Assuming that your data is saved in a file named file.txt:

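# Write the tab-separated header; printf expands \t portably, whereas bash's echo does not without -e
printf 'v1\tv2\tv3\tv4\n' > FD1133_columns.txt
printf 'v1\tv2\tv3\tv4\n' > FD1138_columns.txt
# Skip the header line, pick the quoted field (2nd or 4th when splitting on "), then split it on whitespace
grep -v Barcode file.txt | awk -F '"' '{print $2}' | awk '{print $1"\t"$2"\t"$3"\t"$4}' >> FD1133_columns.txt
grep -v Barcode file.txt | awk -F '"' '{print $4}' | awk '{print $1"\t"$2"\t"$3"\t"$4}' >> FD1138_columns.txt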
written 7 weeks ago by Mensur Dlakic
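
The per-sample pipelines in the answer above could also be folded into a single awk pass that keeps the barcode and handles any number of quoted columns. The following is only a sketch under the assumptions visible in the question: the first line is the header, every sample column is wrapped in double quotes, and the values inside the quotes are separated by whitespace. The function name split_columns and the combined header names such as FD1133_v1 are illustrative choices, not something from the thread.

```shell
#!/usr/bin/env bash
# Sketch: expand every quoted multi-value column into v1..v4 columns in one pass.
# Hypothetical helper; assumes the file layout shown in the question.

split_columns() {
  awk -F '"' '
    NR == 1 {
      # Header line: even-numbered fields are the quoted sample names.
      line = "Barcode"
      for (i = 2; i <= NF; i += 2) {
        for (k = 1; k <= 4; k++) {
          line = line "\t" $i "_v" k
        }
      }
      print line
      next
    }
    {
      # Data line: field 1 is the barcode followed by padding spaces.
      barcode = $1
      gsub(/[ \t]+/, "", barcode)
      line = barcode
      for (i = 2; i <= NF; i += 2) {
        # A single-space separator makes split() ignore leading/trailing blanks.
        n = split($i, v, " ")
        for (k = 1; k <= n; k++) {
          line = line "\t" v[k]
        }
      }
      print line
    }
  ' "$1"
}
```

For the file in the question, split_columns file.txt > all_columns.txt would write one tab-separated header row and one row per barcode; the output file name is only an example.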

Thank you very much.

written 7 weeks ago by melania 2282

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they work. This helps future users who come across this post find the right answer.


written 7 weeks ago by _r_am
7 weeks ago, bioinformatics2020 wrote:

Use the separate function from tidyr, assuming your data.frame is named df:

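# Install tidyr only if it is not already available, then load it
if (!require("tidyr")) install.packages("tidyr")
library(tidyr)
# Trim stray leading/trailing spaces (e.g. " -0.3631 ...") so the split yields exactly 4 pieces
df$FD1133 <- trimws(df$FD1133)
df <- separate(df, col = "FD1133", into = c("col_1","col_2","col_3","col_4"), sep = " ")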

Do this for every column you want to split (just change the col = part of the code).

modified 7 weeks ago • written 7 weeks ago by bioinformatics2020

separate is part of tidyr. Why install the whole tidyverse set of packages? Plus, OP wants v[0-9]+ column names, not col_[0-9]+ column names.

modified 7 weeks ago • written 7 weeks ago by _r_am

Loading up tidyverse is my personal preference. However, I edited my initial point in case OP wants to know the exact package this function comes from. As for the latter part of your question, I would hope OP would understand from my code that they can name the columns whatever they like.

written 7 weeks ago by bioinformatics2020

I feel like it's an unnecessary twist to give the exact code that OP needs with one simple, needless difference. We should give either pseudocode/pointers, exact code with logic-based missing steps, or exact functioning code. Exact code with a random semantic difference achieves next to nothing wrt educating the user.

modified 7 weeks ago • written 7 weeks ago by _r_am

Yes, I think you’re right here. I think as we progress in our programming, we tend to overlook things like this. I’ll keep this in mind in my future responses.

written 7 weeks ago by bioinformatics2020

Thank you very much.

written 7 weeks ago by melania 2282