Question: cut columns in txt tab separated file
25 days ago by
ulises.rodriguez0 wrote:

Hi, I have a table with tab-separated columns, and I would like to split it into individual columns, keeping the column names and the row names.

I have tried this on the Linux command line with cut:

cat  table.txt | cut -f1 -d\t

But it does not work. I have also tried splitting it in R:

separate(data = table ,col = "a",into = "a", sep = "\t")

This is a sample of the table:

            a   b   c   d   e   f   g
Organism_1  1   1   1   1   1   1   1
Organism_2  1   1   1   1   1   1   1
Organism_3  1   1   1   1   1   0   1
Organism_4  1   1   1   1   1   1   1
Organism_5  1   1   1   1   1   1   1
Organism_6  1   1   1   1   1   1   1
Organism_7  1   1   1   1   1   1   1
Organism_8  1   1   1   1   1   1   1
Organism_9  1   1   1   1   1   0   1
Organism_10 0   1   1   1   1   1   1
Organism_11 1   1   1   1   1   1   1
Organism_12 1   1   1   1   1   1   1
Organism_13 1   1   1   1   1   1   1
Organism_14 1   0   1   1   1   1   1
Organism_15 1   1   1   1   1   1   1
Organism_16 1   1   1   1   1   1   1
Organism_17 1   1   1   1   1   1   1
Organism_18 1   1   1   1   1   1   1
Organism_19 1   1   1   1   1   1   1
Organism_20 1   1   1   1   1   1   1
Organism_21 0   0   0   0   0   0   1
Organism_22 1   1   1   1   1   1   1
Organism_23 1   1   1   1   1   1   1
Organism_24 1   1   1   1   1   0   1

The output I want is one table per column, like the following:

                a
    Organism_1  1
    Organism_2  1
    Organism_3  1
    Organism_4  1
    Organism_5  1
    Organism_6  1
    Organism_7  1
    Organism_8  1
    Organism_9  1
    Organism_10 0
    Organism_11 1
    Organism_12 1
    Organism_13 1
    Organism_14 1
    Organism_15 1
    Organism_16 1
    Organism_17 1
    Organism_18 1
    Organism_19 1
    Organism_20 1
    Organism_21 0
    Organism_22 1
    Organism_23 1
    Organism_24 1

I hope someone can help me, thanks

bash linux R • 169 views
modified 25 days ago by RamRS (18k) • written 25 days ago by ulises.rodriguez (0)

Use only

 cut -f1

because the default delimiter is tab.

If you really want to set the delimiter: cut -f 1 -d $'\t'
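
To see why the unquoted -d\t from the question fails, here is a small sketch. The file demo.tsv and its contents are made up for illustration, and the tab variable is just a portable stand-in for $'\t' in shells that lack that quoting form.

```shell
workdir=$(mktemp -d) && cd "$workdir"
tab=$(printf '\t')   # holds one literal tab character

# demo.tsv is a hypothetical two-line sample, not the OP's file.
printf 'name\ta\tb\nOrganism_1\t1\t0\n' > demo.tsv

# Unquoted \t loses its backslash, so cut is told to split on the letter "t".
# No line here contains that letter, so cut prints every line unchanged.
wrong=$(cut -f1 -d\t demo.tsv)

# Relying on the default delimiter, or passing a real tab, selects column 1.
default=$(cut -f1 demo.tsv)
explicit=$(cut -f1 -d "$tab" demo.tsv)

printf '%s\n' "$wrong" "$default" "$explicit"
```

The first command echoes the whole table back, which matches the "does not work" symptom; the other two print only the first column.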

written 25 days ago by Pierre Lindenbaum (113k)

Tab is the default delimiter for cut, so why not try cut -f1,2, then cut -f1,3, etc.? What does that get you?

modified 25 days ago • written 25 days ago by genomax (57k)

Your expected output is not clear. Do you want 7 different files, one per column?

written 25 days ago by zx8754 (5.4k)
25 days ago by
cpad01129.3k
India
cpad01129.3k wrote:

Try this on the OP's data. Replace 8 with the number of columns if you have more columns.

$ for i in $(seq 2 8); do cut -f1,$i test.txt > test$i.txt ; done

Assumptions:

  1. There are 8 columns and 1st column is organism column
  2. Each column is separated by tab

Output: there will be 7 files: test2.txt, test3.txt, and so on. Output from test8.txt:

$ cat test8.txt 
organism    g
Organism_1  1
Organism_2  1
Organism_3  1
Organism_4  1
Organism_5  1
Organism_6  1
Organism_7  1
Organism_8  1
Organism_9  1
Organism_10 1
Organism_11 1
Organism_12 1
Organism_13 1
Organism_14 1
Organism_15 1
Organism_16 1
Organism_17 1
Organism_18 1
Organism_19 1
Organism_20 1
Organism_21 1
Organism_22 1
Organism_23 1
Organism_24 1

You can also try awk (OFS is set so the output stays tab-separated):

$ for i in $(seq 2 8); do awk -v I="$i" -F "\t" -v OFS="\t" '{print $1, $I}' test.txt > file$i.txt; done

With GNU parallel (remove the --dry-run option if you want to actually execute the commands):

$ parallel --dry-run "cut -f1,{} test.txt > test{}.txt" ::: $(seq 2 8)
cut -f1,2 test.txt > test2.txt
cut -f1,3 test.txt > test3.txt
cut -f1,4 test.txt > test4.txt
cut -f1,5 test.txt > test5.txt
cut -f1,6 test.txt > test6.txt
cut -f1,7 test.txt > test7.txt
cut -f1,8 test.txt > test8.txt
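
If you would rather not hard-code the 8, a rough sketch like the one below reads the column count from the header line first. It assumes the header has as many tab-separated fields as the data rows; the small test.txt written here is a made-up stand-in for the real table.

```shell
workdir=$(mktemp -d) && cd "$workdir"

# Hypothetical 3-column sample standing in for the real test.txt.
printf 'organism\ta\tb\nOrganism_1\t1\t0\nOrganism_2\t1\t1\n' > test.txt

# Count the tab-separated fields in the header line.
ncol=$(head -n 1 test.txt | awk -F'\t' '{ print NF }')

# Write one file per data column: column 1 (row names) plus column i.
i=2
while [ "$i" -le "$ncol" ]; do
  cut -f "1,$i" test.txt > "test$i.txt"
  i=$((i + 1))
done

ls test[0-9]*.txt
```

With the sample above this produces test2.txt and test3.txt, each holding the organism column plus one data column.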
modified 25 days ago • written 25 days ago by cpad0112 (9.3k)
25 days ago by
RamRS18k
Houston, TX
RamRS18k wrote:

This looks like an R data file with colnames and rownames, so you'll need to either read it with read.table(..., row.names=1, header=TRUE) or use awk, printing $1 and $n. I have a feeling awk will be able to deal with the blank top-left value better than cut will.
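
A possible awk sketch for that case, assuming the header line really is one field shorter than the data rows (which is what R's write.table produces when row names are written and col.names is left as TRUE). The file rstyle.tsv and the variable names are made up for illustration; if the header is not short, the script falls back to using it as is.

```shell
workdir=$(mktemp -d) && cd "$workdir"

# Hypothetical sample whose header has one field fewer than the rows.
printf 'a\tb\nOrganism_1\t1\t0\nOrganism_2\t0\t1\n' > rstyle.tsv

col=2   # data-row field to extract (field 1 holds the row names)
result=$(awk -F'\t' -v OFS='\t' -v col="$col" '
  NR == 1 { n = split($0, head, FS); next }   # hold the header back for now
  NR == 2 {
    shift = (n == NF - 1) ? 1 : 0             # 1 if the header is one field short
    corner = shift ? "" : head[1]             # empty top-left cell when shifted
    print corner, head[col - shift]
  }
  { print $1, $col }
' rstyle.tsv)

printf '%s\n' "$result"
```

Plain cut -f1,2 on such a file would pair the row names with the wrong header, because the header fields sit one position to the left of the data they label.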

written 25 days ago by RamRS (18k)