Question

how to duplicate a column in different files with bash

19 months ago

Khaleesi95 ▴ 40

Hi all, I have the following set of files:

BMI.0_transf.txt    
ALB.0_transf.txt
ALT.0_transf.txt
APOA.0_transf.txt
APOB.0_transf.txt
AST.0_transf.txt

Each file is organised in the following way (I'll use the first file as an example)

  ID    BMI.0_transf
1000015 0.221001822645847
1000027 -0.67967683719281
1000039 0.693446528768914
1000040 -0.97642697854059
1000053 1.28602376376819
1000064 -2.23464488923183
1000071 1.42851447024267
1000088 -0.919060898470425
1000096 0.169676433701405

Now, I'd like to duplicate the first column for all these files, so as to obtain the following output (still using the first file as an example):

ID ID BMI.0_transf
1000015 1000015 0.221001822645847
1000027 1000027 -0.67967683719281
1000039 1000039 0.693446528768914
1000040 1000040 -0.97642697854059
1000053 1000053 1.28602376376819
1000064 1000064 -2.23464488923183
1000071 1000071 1.42851447024267
1000088 1000088 -0.919060898470425
1000096 1000096 0.169676433701405

Then, I'd like to rename the first two columns in this way:

FID IID BMI.0_transf
1000015 1000015 0.221001822645847
1000027 1000027 -0.67967683719281
1000039 1000039 0.693446528768914
1000040 1000040 -0.97642697854059
1000053 1000053 1.28602376376819
1000064 1000064 -2.23464488923183
1000071 1000071 1.42851447024267
1000088 1000088 -0.919060898470425
1000096 1000096 0.169676433701405

I'm trying to get some practice with bash, and I've tried the following steps:

I've created a file, names.txt, containing the name of the second column in the old version of each file (I'll report an example of the content of this file):
```
BMI.0_transf   
ALB.0_transf
ALT.0_transf
APOA.0_transf
APOB.0_transf
AST.0_transf
```
I've written the following commands to obtain the wanted output for all the files in my folder:

    #!/bin/bash

    file_in=".../names.txt"

    line=$(head -n $file_in) N=$(echo ${line}) #name to use

    for file in *; do awk '{$1=$1 OFS $1} 1' $file | sed -i -e "1 { r"<(echo ' FID IID $N')" d }" $file   ; done

However, it didn't work. Any suggestions on how to fix the code? Thank you so much!

bash • 710 views

updated 13 months ago by Zhitian Wu ▴ 60 • written 19 months ago by Khaleesi95 ▴ 40

score 3 · Answer 1 · 2022-09-10

Going for awk is already a good approach!

Without simply giving you the 'solution', here are some pointers:

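awk '{ print $1, $1, $2 }' <file> > <new file> will duplicate the first column (the commas matter: without them awk glues the fields together with no separator)

adding for instance NR == 1 { print "F" $1, "I" $1, $2; next } will add the F & I only to the first line (the NR == 1 part); the next stops the header from being printed a second time by the general rule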

Now it's up to you to fill in the correct syntax and tweak it for your specific purpose.
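For readers who want to see the two pointers combined, here is a minimal sketch for a single file. It assumes whitespace-separated columns, as in the question; the scratch directory and the `_new` output name are hypothetical choices made for this demo, and the sample rows are copied from the question.

```shell
# Hypothetical scratch directory and sample file, created only for this demo
demo_dir=$(mktemp -d)
printf 'ID BMI.0_transf\n1000015 0.221001822645847\n1000027 -0.67967683719281\n' \
    > "$demo_dir/BMI.0_transf.txt"

# Header line (NR == 1): print FID and IID, then the original second column name.
# Every other line: print the first column twice, then the second column.
awk 'NR == 1 { print "FID", "IID", $2; next }
     { print $1, $1, $2 }' "$demo_dir/BMI.0_transf.txt" > "$demo_dir/BMI.0_transf_new.txt"

cat "$demo_dir/BMI.0_transf_new.txt"
```

Writing to a separate output file rather than overwriting the input means the original stays intact if the result is not what you expected.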

score 0 · Answer 2 · 2022-10-26

awk '{$1=$1 OFS $1} 1' $file | sed -i -e "1 { r"<(echo ' FID IID $N')" d }" $file

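I guess you are not yet familiar with standard input/output and sed?

sed -i should only be used when you are modifying the content of a file in place. Here, you want sed to

receive its input from the pipe (the output of awk), then
process that input with the sed commands, then
redirect the output somewhere else,

so you cannot use -i. Because $file is also passed to sed as an argument, sed reads and edits that file directly and never looks at the output of awk.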
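The difference between reading a file operand and reading the pipe can be seen in a small sketch. The temporary file and its contents are made up for this demo only.

```shell
# Hypothetical one-line file, created only for this demo
demo_file=$(mktemp)
printf 'ID value\n' > "$demo_file"

# sed is given a file operand, so it reads that file and ignores the pipe
printf 'from the pipe\n' | sed 's/ID/FID/' "$demo_file"
# prints: FID value

# With no file operand, sed reads standard input, i.e. the pipe
printf 'ID from the pipe\n' | sed 's/ID/FID/'
# prints: FID from the pipe
```

In the original command the awk output is therefore thrown away, which is why the duplicated column never appears.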

The right command for this task could be

    awk '{ print $1, $1, $2 }' "$file" |
    sed '1 s/ID ID/FID IID/' > "${file}_new"
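To apply this to every file in the folder, one possible sketch is a loop like the one below. It assumes all input files match `*_transf.txt` and have the two-column layout from the question; the scratch directory, the `_new.txt` suffix, and the ALB value are made up for this demo. Since each file's header already holds the name of its second column, this version does not need names.txt.

```shell
# Hypothetical scratch directory with two small sample files for this demo
demo_dir=$(mktemp -d)
printf 'ID BMI.0_transf\n1000015 0.221001822645847\n' > "$demo_dir/BMI.0_transf.txt"
printf 'ID ALB.0_transf\n1000015 0.5\n' > "$demo_dir/ALB.0_transf.txt"  # 0.5 is a made-up value

for file in "$demo_dir"/*_transf.txt; do
    # Strip the .txt suffix and append _new.txt, e.g. BMI.0_transf_new.txt
    out="${file%.txt}_new.txt"
    # Duplicate column 1 on every line, then rename the header on line 1 only
    awk '{ print $1, $1, $2 }' "$file" |
        sed '1 s/ID ID/FID IID/' > "$out"
done

cat "$demo_dir"/*_new.txt
```

The output names end in `_new.txt`, so they do not match `*_transf.txt` and are not picked up again if the loop is rerun.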