Question

Split into separate columns

2.1 years ago

putty ▴ 40

How can I split the chromosome (chr) and base-pair (bp) position into separate columns? My IDs are in chr:bp format:

ID A1 A2 A1_freq P value
7:146019500 T C 0.42 0.334
13:65060537 C G 0.001 0.312 
5:179589868 A G 0.005 0.102

I would like to split ID into one column for chr and one for bp, keeping the original ID column. How can I do this on Linux? Desired output:

CHR BP ID A1 A2 A1_freq P value
7 146019500 7:146019500 T C 0.42 0.334
13 65060537 13:65060537 C G 0.001 0.312 
5 179589868 5:179589868 A G 0.005 0.102

TIA

header id split • 983 views

ADD COMMENT • link updated 2.1 years ago by cpad0112 21k • written 2.1 years ago by putty ▴ 40

$ awk -v OFS="\t" '{print $1,$0}' test.txt | sed -re 's/^ID\t/CHR\tBP\t/;s/:/\t/'

CHR BP  ID  A1  A2  A1_freq P   value
7   146019500   7:146019500 T   C   0.42    0.334
13  65060537    13:65060537 C   G   0.001   0.312   
5   179589868   5:179589868 A   G   0.005   0.102

ADD REPLY • link 2.1 years ago by cpad0112 21k
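The one-liner above relies on GNU sed understanding `\t` and on the input already being tab-delimited. A minimal awk-only sketch of the same idea, assuming the input is whitespace-delimited and the header's last two words form the single column name "P value" (the file names `test.txt` and `out.tsv` are placeholders):

```shell
# Recreate the sample input from the question (whitespace-delimited).
cat > test.txt <<'EOF'
ID A1 A2 A1_freq P value
7:146019500 T C 0.42 0.334
13:65060537 C G 0.001 0.312
5:179589868 A G 0.005 0.102
EOF

# Header row: prepend CHR and BP, and keep "P value" as one column name
# (assumption: the header has exactly these six whitespace-separated words).
# Data rows: split the first field on ":" and prepend the two pieces;
# "$1 = $1" forces awk to rebuild the row with tabs between fields.
awk 'BEGIN { OFS = "\t" }
NR == 1 { print "CHR", "BP", $1, $2, $3, $4, $5 " " $6; next }
{ split($1, pos, ":"); $1 = $1; print pos[1], pos[2], $0 }' test.txt > out.tsv
```

The result in `out.tsv` is tab-delimited throughout, with the original ID column kept as the third column.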

score 0 · Answer 1 · 2022-03-18

2.1 years ago

shenwei356 8.4k

Try csvtk sep:

$ csvtk sep -t -f ID -s : -n CHR,BP -R  test.tsv \
    | csvtk cut -t -f "CHR,BP,A1,A2,A1_freq,P value"
CHR     BP      A1      A2      A1_freq P value
7       146019500       T       C       0.42    0.334
13      65060537        C       G       0.001   0.312
5       179589868       A       G       0.005   0.102

ADD COMMENT • link 2.1 years ago by shenwei356 8.4k

Thank you for the prompt reply, but I get this error: column "ID" not existed in file: test

tia

ADD REPLY • link 2.1 years ago by putty ▴ 40

Is it not tab-delimited?

If not, run csvtk space2tab first, then replace "P<tab>value" in the header with "P value", because the conversion also splits that two-word column name.

ADD REPLY • link 2.1 years ago by shenwei356 8.4k

Thank you, it works if I leave out the other columns and split only ID. I then merged the two files.

ADD REPLY • link 2.1 years ago by putty ▴ 40
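The split-then-merge workaround described here can be sketched with awk and paste. The exact commands used were not posted, so this is only one possible way to do it, and the file names `chr_bp.tsv` and `merged.txt` are hypothetical:

```shell
# Recreate the sample input from the question (space-delimited).
cat > test.txt <<'EOF'
ID A1 A2 A1_freq P value
7:146019500 T C 0.42 0.334
13:65060537 C G 0.001 0.312
5:179589868 A G 0.005 0.102
EOF

# Step 1: look only at the ID column and split it on ":" into CHR and BP.
awk 'BEGIN { OFS = "\t" }
NR == 1 { print "CHR", "BP"; next }
{ split($1, pos, ":"); print pos[1], pos[2] }' test.txt > chr_bp.tsv

# Step 2: glue the two new columns onto the original rows, line by line.
# paste relies on both files having the same number of lines in the same order.
paste chr_bp.tsv test.txt > merged.txt
```

Note that `merged.txt` mixes delimiters: tabs after CHR and BP, and the original spaces in the rest of each row. Converting the input to tabs first avoids that.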