Question: cut columns in txt tab separated file
11 weeks ago by
ulises.rodriguez wrote:

Hi, I have a table with tab-separated columns, and I would like to split it into one table per column, keeping the column name and the row names in each.

I have tried on the Linux command line with cut:

cat  table.txt | cut -f1 -d\t

But it does not work. I have also tried splitting it in R with separate:

separate(data = table ,col = "a",into = "a", sep = "\t")

This is a sample of the table:

            a   b   c   d   e   f   g
Organism_1  1   1   1   1   1   1   1
Organism_2  1   1   1   1   1   1   1
Organism_3  1   1   1   1   1   0   1
Organism_4  1   1   1   1   1   1   1
Organism_5  1   1   1   1   1   1   1
Organism_6  1   1   1   1   1   1   1
Organism_7  1   1   1   1   1   1   1
Organism_8  1   1   1   1   1   1   1
Organism_9  1   1   1   1   1   0   1
Organism_10 0   1   1   1   1   1   1
Organism_11 1   1   1   1   1   1   1
Organism_12 1   1   1   1   1   1   1
Organism_13 1   1   1   1   1   1   1
Organism_14 1   0   1   1   1   1   1
Organism_15 1   1   1   1   1   1   1
Organism_16 1   1   1   1   1   1   1
Organism_17 1   1   1   1   1   1   1
Organism_18 1   1   1   1   1   1   1
Organism_19 1   1   1   1   1   1   1
Organism_20 1   1   1   1   1   1   1
Organism_21 0   0   0   0   0   0   1
Organism_22 1   1   1   1   1   1   1
Organism_23 1   1   1   1   1   1   1
Organism_24 1   1   1   1   1   0   1

The output I want is one table per column, like the following:

                a
    Organism_1  1
    Organism_2  1
    Organism_3  1
    Organism_4  1
    Organism_5  1
    Organism_6  1
    Organism_7  1
    Organism_8  1
    Organism_9  1
    Organism_10 0
    Organism_11 1
    Organism_12 1
    Organism_13 1
    Organism_14 1
    Organism_15 1
    Organism_16 1
    Organism_17 1
    Organism_18 1
    Organism_19 1
    Organism_20 1
    Organism_21 0
    Organism_22 1
    Organism_23 1
    Organism_24 1

I hope someone can help me, thanks

bash linux R • 217 views
modified 11 weeks ago by RamRS • written 11 weeks ago by ulises.rodriguez

Just use

 cut -f1

because the default delimiter is tab.

If you really want to set the delimiter explicitly: cut -f 1 -d $'\t'

written 11 weeks ago by Pierre Lindenbaum
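
As a side note on the original attempt: `cut -f1 -d\t` most likely fails because the shell consumes the unquoted backslash, so cut receives the letter t as its delimiter instead of a tab. Lines that do not contain the delimiter are printed whole, which would explain why the table comes back unchanged. A small sketch on a made-up file (the name demo.tsv and its contents are only an illustration, not the OP's table):

```shell
#!/usr/bin/env bash
# Hypothetical sample file standing in for table.txt.
workdir=$(mktemp -d)
printf 'id\ta\tb\nOrganism_1\t1\t0\n' > "$workdir/demo.tsv"

# Unquoted, the backslash is consumed by the shell: cut sees "-dt".
arg_seen=$(printf '%s' -d\t)

# With "t" as delimiter no line is split here, so cut prints every line whole.
wrong_out=$(cut -f1 -d\t "$workdir/demo.tsv")

# Tab is the default delimiter, so this already selects the row names.
default_out=$(cut -f1 "$workdir/demo.tsv")

# Explicit tab delimiter, portable across POSIX shells (bash also accepts $'\t').
tab=$(printf '\t')
explicit_out=$(cut -f1 -d "$tab" "$workdir/demo.tsv")

printf '%s\n' "$default_out"
```
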

Tab is the default delimiter for cut so why not try cut -f1,2 then cut -f1,3 etc? What does that get you?

modified 11 weeks ago • written 11 weeks ago by genomax

Your expected output is not clear. Do you want 7 different files, one per column?

written 11 weeks ago by zx8754
11 weeks ago by
cpad0112 (India) wrote:

Try this on the OP's data (replace 8 with the actual number of columns if you have more):

$ for i in $(seq 2 8); do cut -f1,$i test.txt > test$i.txt ; done

Assumptions:

  1. There are 8 columns and the 1st column is the organism column
  2. Columns are separated by tabs

Output: 7 files (test2.txt, test3.txt, and so on). Output from test8.txt:

$ cat test8.txt 
organism    g
Organism_1  1
Organism_2  1
Organism_3  1
Organism_4  1
Organism_5  1
Organism_6  1
Organism_7  1
Organism_8  1
Organism_9  1
Organism_10 1
Organism_11 1
Organism_12 1
Organism_13 1
Organism_14 1
Organism_15 1
Organism_16 1
Organism_17 1
Organism_18 1
Organism_19 1
Organism_20 1
Organism_21 1
Organism_22 1
Organism_23 1
Organism_24 1
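
The hard-coded 8 can be avoided by counting the fields of a data row first. A minimal sketch, assuming a tab-separated file whose first column holds the row names; the file name test.txt follows the loop above, while the three-line sample contents are made up for illustration:

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
# Small hypothetical stand-in for test.txt.
printf 'organism\ta\tb\nOrganism_1\t1\t0\nOrganism_2\t0\t1\n' > "$workdir/test.txt"

# Count the tab-separated fields on the second line (the first data row).
ncol=$(awk -F '\t' 'NR == 2 { print NF; exit }' "$workdir/test.txt")

# One output file per data column, each keeping column 1 (the row names).
for i in $(seq 2 "$ncol"); do
    cut -f "1,$i" "$workdir/test.txt" > "$workdir/test$i.txt"
done
```

Counting on a data row rather than on the header is a deliberate choice here, since the header may be one field short if the file came from R.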

You can also try awk (OFS is set so that the output stays tab-separated):

$ for i in $(seq 2 8); do awk -v I="$i" -F '\t' -v OFS='\t' '{print $1, $I}' test.txt > file$i.txt; done
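
A single awk pass can also write all per-column files at once and name them after the header instead of the column index. This is only a sketch: it assumes the header names are safe to use as file names and that there are few enough columns for awk to keep every output file open. The sample data and the column_ prefix are hypothetical.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
# Small hypothetical stand-in for test.txt.
printf 'organism\ta\tb\nOrganism_1\t1\t0\nOrganism_2\t0\t1\n' > "$workdir/test.txt"

awk -F '\t' -v OFS='\t' -v dir="$workdir" '
    # On the header line, derive one output file name per data column.
    NR == 1 { for (i = 2; i <= NF; i++) out[i] = dir "/column_" $i ".txt" }
    # On every line (header included), write row name plus column i to its file.
    { for (i = 2; i <= NF; i++) print $1, $i > out[i] }
' "$workdir/test.txt"
```
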

With GNU parallel (remove the --dry-run option to actually execute the commands):

$ parallel --dry-run "cut -f1,{} test.txt > test{}.txt" ::: $(seq 2 8)
cut -f1,2 test.txt > test2.txt
cut -f1,3 test.txt > test3.txt
cut -f1,4 test.txt > test4.txt
cut -f1,5 test.txt > test5.txt
cut -f1,6 test.txt > test6.txt
cut -f1,7 test.txt > test7.txt
cut -f1,8 test.txt > test8.txt
modified 11 weeks ago • written 11 weeks ago by cpad0112

11 weeks ago by
RamRS (Houston, TX) wrote:

This looks like an R data file with column names and row names, so you'll either need read.table(..., row.names=1, header=TRUE) or awk, printing $1 and $n. I have a feeling awk will be able to deal with the blank top-left value better than cut will.
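
If the file was written by R's write.table with row names, the header line may have one field fewer than the data rows, in which case cut -f1,2 would pair each column with the wrong header. A hedged awk sketch that pads such a header before splitting; it assumes tab-separated input, and the sample file rtable.txt and the col prefix are made up for illustration:

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
# Hypothetical R-style table: the header has 2 fields, the data rows have 3.
printf 'a\tb\nOrganism_1\t1\t0\nOrganism_2\t0\t1\n' > "$workdir/rtable.txt"

awk -F '\t' -v OFS='\t' -v dir="$workdir" '
    # Hold the header back until the first data row shows the real field count.
    NR == 1 { header = $0; hn = NF; next }
    NR == 2 {
        # Pad the header with an empty first field if it is one field short.
        if (hn == NF - 1) header = OFS header
        n = split(header, h, FS)
        for (i = 2; i <= n; i++) print h[1], h[i] > (dir "/col" i ".txt")
    }
    # Every data row: row name plus column i goes to the file for column i.
    { for (i = 2; i <= NF; i++) print $1, $i > (dir "/col" i ".txt") }
' "$workdir/rtable.txt"
```

If the header already has a (possibly empty) first field, the padding step is skipped and the result matches what the cut loop produces.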

written 11 weeks ago by RamRS