Question

cut columns in txt tab separated file

5.6 years ago

ulises.rodriguez • 0

Hi, I have a table with tab-separated columns, and I would like to split it into one table per column, keeping the column name and the row names.

I have tried on the Linux command line with cut:

cat  table.txt | cut -f1 -d\t

But it does not work. I have also tried splitting it in R with separate():

separate(data = table ,col = "a",into = "a", sep = "\t")

This is a sample of the table:

            a   b   c   d   e   f   g
Organism_1  1   1   1   1   1   1   1
Organism_2  1   1   1   1   1   1   1
Organism_3  1   1   1   1   1   0   1
Organism_4  1   1   1   1   1   1   1
Organism_5  1   1   1   1   1   1   1
Organism_6  1   1   1   1   1   1   1
Organism_7  1   1   1   1   1   1   1
Organism_8  1   1   1   1   1   1   1
Organism_9  1   1   1   1   1   0   1
Organism_10 0   1   1   1   1   1   1
Organism_11 1   1   1   1   1   1   1
Organism_12 1   1   1   1   1   1   1
Organism_13 1   1   1   1   1   1   1
Organism_14 1   0   1   1   1   1   1
Organism_15 1   1   1   1   1   1   1
Organism_16 1   1   1   1   1   1   1
Organism_17 1   1   1   1   1   1   1
Organism_18 1   1   1   1   1   1   1
Organism_19 1   1   1   1   1   1   1
Organism_20 1   1   1   1   1   1   1
Organism_21 0   0   0   0   0   0   1
Organism_22 1   1   1   1   1   1   1
Organism_23 1   1   1   1   1   1   1
Organism_24 1   1   1   1   1   0   1

The output I want is one table per column, like the following:

                a
    Organism_1  1
    Organism_2  1
    Organism_3  1
    Organism_4  1
    Organism_5  1
    Organism_6  1
    Organism_7  1
    Organism_8  1
    Organism_9  1
    Organism_10 0
    Organism_11 1
    Organism_12 1
    Organism_13 1
    Organism_14 1
    Organism_15 1
    Organism_16 1
    Organism_17 1
    Organism_18 1
    Organism_19 1
    Organism_20 1
    Organism_21 0
    Organism_22 1
    Organism_23 1
    Organism_24 1

I hope someone can help me, thanks

R linux bash • 5.6k views

updated 5.6 years ago by Ram 43k • written 5.6 years ago by ulises.rodriguez • 0

You only need

 cut -f1

because the default delimiter of cut is tab.

If you really want to set the delimiter explicitly: cut -f 1 -d $'\t'
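
As a small illustration of why the `-d\t` attempt in the question does not split anything: without quoting, the shell drops the backslash, so cut receives the letter "t" as its delimiter. The sketch below uses a made-up file name (demo.txt) and made-up contents, and spells the tab with printf so that it also runs in shells without `$'\t'` support.

```shell
# Hypothetical two-line, tab-separated sample file
printf 'name\ta\tb\nOrganism_1\t1\t0\n' > demo.txt

# Unquoted \t: the shell drops the backslash, so cut splits on the letter "t".
# No line here contains a "t", so cut prints every line unchanged.
cut -f1 -d\t demo.txt > wrong.txt

# Default delimiter (tab)
cut -f1 demo.txt > default.txt

# Explicit tab; in bash, -d $'\t' is an equivalent spelling
TAB=$(printf '\t')
cut -f1 -d "$TAB" demo.txt > explicit.txt
```

Here wrong.txt ends up identical to demo.txt, while default.txt and explicit.txt both contain only the first column.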

5.6 years ago by Pierre Lindenbaum 161k

Tab is the default delimiter for cut so why not try cut -f1,2 then cut -f1,3 etc? What does that get you?

5.6 years ago by GenoMax 141k

Your expected output is not clear. Do you want 7 different files, one per column?

5.6 years ago by zx8754 11k

score 5 · Answer 1 · 2018-09-24

Try this on the OP's data (replace 8 with your total number of columns if you have more):

$ for i in $(seq 2 8); do cut -f1,$i test.txt > test$i.txt ; done

Assumptions:

There are 8 columns and 1st column is organism column
Each column is separated by tab

Output: 7 files named test2.txt, test3.txt, and so on. Contents of test8.txt:

$ cat test8.txt 
organism    g
Organism_1  1
Organism_2  1
Organism_3  1
Organism_4  1
Organism_5  1
Organism_6  1
Organism_7  1
Organism_8  1
Organism_9  1
Organism_10 1
Organism_11 1
Organism_12 1
Organism_13 1
Organism_14 1
Organism_15 1
Organism_16 1
Organism_17 1
Organism_18 1
Organism_19 1
Organism_20 1
Organism_21 1
Organism_22 1
Organism_23 1
Organism_24 1

You can also try awk (OFS is set so the output stays tab-separated):

$ for i in $(seq 2 8); do awk -v I="$i" -F '\t' -v OFS='\t' '{print $1, $I}' test.txt > file$i.txt; done

With GNU parallel (remove the --dry-run option to actually execute the commands):

$ parallel --dry-run "cut -f1,{} test.txt > test{}.txt" ::: $(seq 2 8)
cut -f1,2 test.txt > test2.txt
cut -f1,3 test.txt > test3.txt
cut -f1,4 test.txt > test4.txt
cut -f1,5 test.txt > test5.txt
cut -f1,6 test.txt > test6.txt
cut -f1,7 test.txt > test7.txt
cut -f1,8 test.txt > test8.txt
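
If you would rather not hard-code the 8, the column count can be read from the header line. The following is only a sketch under a few assumptions: the sample test.txt is made up, the header has a label for the first column (as in the output above), the column names are safe to use in file names, and the col_ prefix is an arbitrary choice.

```shell
# Hypothetical sample: the header row includes a label for the first column
printf 'organism\ta\tb\nOrganism_1\t1\t0\nOrganism_2\t1\t1\n' > test.txt

# Count the tab-separated fields on the header line
ncol=$(head -n 1 test.txt | awk -F '\t' '{print NF}')

# One output file per data column, named after that column's header
for i in $(seq 2 "$ncol"); do
    name=$(head -n 1 test.txt | cut -f "$i")
    cut -f "1,$i" test.txt > "col_${name}.txt"
done
```

With this sample, the loop writes col_a.txt and col_b.txt, each holding the organism column plus one data column.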

score 1 · Answer 2 · 2018-09-24

5.6 years ago

Ram 43k

This looks like an R data file with colnames and rownames, so you'll need either read.table(..., row.names=1, header=TRUE) in R or awk, printing $1 and $n. I have a feeling awk will be able to deal with the blank top-left value better than cut will.
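
A rough awk sketch of that idea, assuming the file has the layout R's write.table produces by default, where the header line has no label for the row-name column and therefore one field fewer than the data rows. The file name rtable.txt, its contents, and the rcol_ prefix are made up for illustration. If the header instead starts with a tab (an empty first field), the fields already line up and this offset handling is not needed.

```shell
# Hypothetical sample: header has one field fewer than the data rows
printf 'a\tb\nOrganism_1\t1\t0\nOrganism_2\t1\t1\n' > rtable.txt

awk -F '\t' -v OFS='\t' '
NR == 1 {
    # Header field i names data field i + 1
    n = NF
    for (i = 1; i <= n; i++) {
        out[i] = "rcol_" $i ".txt"
        print "", $i > out[i]      # blank top-left cell, then the column name
    }
    next
}
{
    # Row name plus one value per output file
    for (i = 1; i <= n; i++)
        print $1, $(i + 1) > out[i]
}' rtable.txt
```

This reads the table once and keeps one output file open per column, which is fine for a handful of columns but may hit the open-file limit on very wide tables.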
