Averaging across multiple columns
2
0
Entering edit mode
2.4 years ago
Agamemnon ▴ 80

Hi,

I would like to average values from column 3 to 24. The data structure is as follows:

#FID IID SCORE10.AVG SCORE11.AVG SCORE12.AVG SCORE13.AVG SCORE14.AVG SCORE15.AVG SCORE16.AVG SCORE17.AVG SCORE18.AVG SCORE19.AVG SCORE1.AVG SCORE20.AVG SCORE21.AVG SCORE22.AVG SCORE2.AVG SCORE3.AVG SCORE4.AVG SCORE5.AVG SCORE6.AVG SCORE7.AVG SCORE8.AVG SCORE9.AVG
4206209 2159878 7.81977e-05 6.47787e-05 -0.0021166 -0.000111422 7.81977e-05 -0.00194971 0.000976662 -0.00178663 -0.0029736 -0.00166744 -0.000402559 -0.00210566 -0.00130077 0.00122846 -0.00296566 -0.000676401 0.000319205 -0.0012993 -0.000231331 -0.00251666 -0.00246146 -0.00089255 -0.00162087
2619089 5849169 -3.36376e-07 0.000166214 -0.00229081 0.000175182 -3.36376e-07 -0.00144829 0.000844795 -0.00178175 -0.00392416 -0.00169813 -0.000339629 -0.00241469 -0.00124691 0.00117414 -0.00293646 -0.00043481 0.000182961 -0.00155656 -0.000366005 -0.00274512 -0.00220684 -0.000876328 -0.00163762
4963755 4126600 4.80562e-05 -8.03571e-06 -0.00205945 2.89793e-05 4.80562e-05 -0.00180237 0.000952638 -0.00180321 -0.00410665 -0.001539 -0.00034781 -0.00228668 -0.0012383 0.0011158 -0.00306278 -0.000557173 0.000221675 -0.00140883 -0.000314032 -0.00245025 -0.00225357 -0.000983217 -0.00144304
5292894 2300468 4.70042e-05 0.000112097 -0.00187693 1.57412e-05 4.70042e-05 -0.00186633 0.000984547 -0.00170159 -0.00414575 -0.00170298 -3.9728e-05 -0.00229207 -0.00150641 0.00120988 -0.00315069 -0.000646717 0.000189562 -0.0012411 -0.000346555 -0.00258908 -0.00250897 -0.000868924 -0.00160577
...

The aim is to get something like:

#FID IID Summed_Mean
4206209 2159878 [(sum 3-24)/22]
2619089 5849169 [(sum 3-24)/22]
4963755 4126600 [(sum 3-24)/22]
5292894 2300468 [(sum 3-24)/22]

I know awk would be the appropriate tool, but I am not sure, how to tackle it, and can't find anything on the forum that could be adapted.

awk bash column • 1.2k views
ADD COMMENT
0
Entering edit mode

Seems there is an extra column at the end i.e 25 columns and 25th column without header.

ADD REPLY
2
Entering edit mode
2.4 years ago
Agamemnon ▴ 80

Got the answer from a different forum:

awk '
NR==1 {
  print $1,$2,"Average"
  next
}
{
  sum=0
  for(i=3;i<=24;i++){
    sum+=$i
  }
  print $1,$2,sum/22
}
' merged.sscore
ADD COMMENT
2
Entering edit mode
2.4 years ago
awk '{T=0.0;N=0;for(i=3;i<=24;i++) {T+=$i;N++;} printf("%s\t%s\n",$0,T/N);}'
ADD COMMENT
0
Entering edit mode

This prints entire data and average at the end.

$ awk -v FS="\t" '{T=0.0;N=0;for(i=3;i<=24;i++) {T+=$i;N++;} printf("%s\t%s\t%s\n", $1,$2,T/N);}' will print OP required output.

ADD REPLY
0
Entering edit mode

I understand now, this adds the last column with the summed average.

ADD REPLY

Login before adding your answer.

Traffic: 2168 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6