Question: cut columns in txt tab separated file
0
4 months ago by
ulises.rodriguez0 wrote:

Hi, I have a table with tab-separated columns, and I would like to split it into individual columns, keeping the column name and the row names.

I have tried cut on the Linux command line:

cat  table.txt | cut -f1 -d\t

But it does not work. I have also tried splitting it in R with separate:

separate(data = table ,col = "a",into = "a", sep = "\t")

This is a sample of the table:

            a   b   c   d   e   f   g
Organism_1  1   1   1   1   1   1   1
Organism_2  1   1   1   1   1   1   1
Organism_3  1   1   1   1   1   0   1
Organism_4  1   1   1   1   1   1   1
Organism_5  1   1   1   1   1   1   1
Organism_6  1   1   1   1   1   1   1
Organism_7  1   1   1   1   1   1   1
Organism_8  1   1   1   1   1   1   1
Organism_9  1   1   1   1   1   0   1
Organism_10 0   1   1   1   1   1   1
Organism_11 1   1   1   1   1   1   1
Organism_12 1   1   1   1   1   1   1
Organism_13 1   1   1   1   1   1   1
Organism_14 1   0   1   1   1   1   1
Organism_15 1   1   1   1   1   1   1
Organism_16 1   1   1   1   1   1   1
Organism_17 1   1   1   1   1   1   1
Organism_18 1   1   1   1   1   1   1
Organism_19 1   1   1   1   1   1   1
Organism_20 1   1   1   1   1   1   1
Organism_21 0   0   0   0   0   0   1
Organism_22 1   1   1   1   1   1   1
Organism_23 1   1   1   1   1   1   1
Organism_24 1   1   1   1   1   0   1

The output I want is one table per column, like the following:

                a
    Organism_1  1
    Organism_2  1
    Organism_3  1
    Organism_4  1
    Organism_5  1
    Organism_6  1
    Organism_7  1
    Organism_8  1
    Organism_9  1
    Organism_10 0
    Organism_11 1
    Organism_12 1
    Organism_13 1
    Organism_14 1
    Organism_15 1
    Organism_16 1
    Organism_17 1
    Organism_18 1
    Organism_19 1
    Organism_20 1
    Organism_21 0
    Organism_22 1
    Organism_23 1
    Organism_24 1

I hope someone can help me, thanks

bash linux R • 276 views
ADD COMMENTlink modified 4 months ago by RamRS20k • written 4 months ago by ulises.rodriguez0
1

Use only

 cut -f1

because the default delimiter is tab.

if you really want to set the delimiter: cut -f 1 -d $'\t'
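As a rough illustration of why the original `-d\t` attempt misbehaves: the shell removes an unquoted backslash before cut sees its arguments, so cut ends up splitting on the letter t. Lines that contain no t at all (as in the sample table) are then printed unchanged, which looks like cut doing nothing. The sketch below uses a made-up line, not the OP's file, and builds the tab with printf so it also runs in plain sh; in bash, `$'\t'` is equivalent.

```shell
# Made-up sample line: two tab-separated fields; the first contains a letter "t".
tab=$(printf '\t')
line="Organism_ten${tab}1"

# Unquoted \t: the shell removes the backslash, so cut receives -dt
# and splits on the letter "t" instead of on tabs.
wrong=$(printf '%s\n' "$line" | cut -f1 -d\t)

# Passing a real tab character (in bash, $'\t' does the same) or
# relying on cut's default delimiter gives the intended first field.
explicit=$(printf '%s\n' "$line" | cut -f1 -d "$tab")
default=$(printf '%s\n' "$line" | cut -f1)

printf 'unquoted: %s\nexplicit: %s\ndefault:  %s\n' "$wrong" "$explicit" "$default"
```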

ADD REPLYlink written 4 months ago by Pierre Lindenbaum116k
1

Tab is the default delimiter for cut so why not try cut -f1,2 then cut -f1,3 etc? What does that get you?

ADD REPLYlink modified 4 months ago • written 4 months ago by genomax62k
1

Your expected output is not clear. Do you want 7 different files, one per column?

ADD REPLYlink written 4 months ago by zx87546.5k
5
4 months ago by
cpad011211k
India
cpad011211k wrote:

Try this on the OP's data. Replace 8 with the number of columns if you have more columns.

$ for i in $(seq 2 8); do cut -f1,$i test.txt > test$i.txt ; done

Assumptions:

  1. There are 8 columns and 1st column is organism column
  2. Each column is separated by tab

Output: 7 files, test2.txt, test3.txt, and so on. Output from test8.txt:

$ cat test8.txt 
organism    g
Organism_1  1
Organism_2  1
Organism_3  1
Organism_4  1
Organism_5  1
Organism_6  1
Organism_7  1
Organism_8  1
Organism_9  1
Organism_10 1
Organism_11 1
Organism_12 1
Organism_13 1
Organism_14 1
Organism_15 1
Organism_16 1
Organism_17 1
Organism_18 1
Organism_19 1
Organism_20 1
Organism_21 1
Organism_22 1
Organism_23 1
Organism_24 1

You can also try awk (OFS is set so the output stays tab-separated):

$ for i in $(seq 2 8); do awk -v I="$i" -F '\t' -v OFS='\t' '{ print $1, $I }' test.txt > file$i.txt; done

With GNU parallel (remove the --dry-run option to actually execute the commands):

$ parallel --dry-run "cut -f1,{} test.txt > test{}.txt" ::: $(seq 2 8)
cut -f1,2 test.txt > test2.txt
cut -f1,3 test.txt > test3.txt
cut -f1,4 test.txt > test4.txt
cut -f1,5 test.txt > test5.txt
cut -f1,6 test.txt > test6.txt
cut -f1,7 test.txt > test7.txt
cut -f1,8 test.txt > test8.txt
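
The hard-coded 8 in the loops above could instead be derived from the file. A minimal sketch, assuming the first line has as many tab-separated fields as the data rows (assumption 1 above); the function name split_by_cut is made up for illustration:

```shell
# Hypothetical helper: write one two-column file (row names + one data column)
# per data column of a tab-separated file whose name ends in .txt.
split_by_cut() {
  input=$1
  # Count tab-separated fields on the first line instead of hard-coding 8.
  ncol=$(head -n 1 "$input" | awk -F '\t' '{ print NF }')
  i=2
  while [ "$i" -le "$ncol" ]; do
    # test.txt -> test2.txt, test3.txt, ... next to the input file
    cut -f "1,$i" "$input" > "${input%.txt}$i.txt"
    i=$((i + 1))
  done
}
```

Calling `split_by_cut test.txt` should produce the same test2.txt to test8.txt files as the loop with `seq 2 8` for the 8-column example.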
ADD COMMENTlink modified 4 months ago • written 4 months ago by cpad011211k
1
4 months ago by
RamRS20k
Houston, TX
RamRS20k wrote:

This looks like an R data file with colnames and rownames, so you'll need either read.table(..., row.names=1, header=TRUE) in R or awk, printing $1 and $n. I have a feeling awk will be able to deal with the blank top-left value better than cut will.
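
One way the awk route might look, as a sketch only. It assumes that a header with exactly one field fewer than the data rows is an R-style header with the top-left cell omitted, and that column names are safe to use in file names; the function name split_columns and the output prefix are made up for illustration.

```shell
# Hypothetical helper: split_columns INPUT PREFIX
# Writes PREFIX<colname>.txt for each data column, in a single pass.
split_columns() {
  awk -F '\t' -v OFS='\t' -v prefix="$2" '
    # Keep the header names until the first data row shows the real width.
    NR == 1 { n = split($0, header, "\t"); next }
    NR == 2 {
      # Assumption: a header with one field fewer than the data rows is an
      # R-style header without the top-left cell, so names shift by one.
      off = (n == NF - 1) ? 1 : 0
      for (i = 2; i <= NF; i++) {
        out[i] = prefix header[i - off] ".txt"
        print "", header[i - off] > out[i]   # blank top-left cell, then name
      }
    }
    # Every data row: row name plus the value of each column, one file each.
    { for (i = 2; i <= NF; i++) print $1, $i > out[i] }
  ' "$1"
}
```

For example, `split_columns table.txt col_` would write col_a.txt, col_b.txt and so on for the sample table, each with a blank top-left cell as in the requested output.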

ADD COMMENTlink written 4 months ago by RamRS20k