Help with awk loop
3
0
Entering edit mode
3.5 years ago

Hi everyone! I'm new to awk and i'm having troubles with a script i thought would be easier. I have a variable number of columns with numbers (e.g. 4.45), and some of the "cells" have missing values (--). I have to calculate the mean of each row, adding a new column with it. I tried something like:

awk 'BEGIN {FS=";";OFS=";";SUM=0;AVG=0;MSS=0};
NR==1 {print $0,"Average","Missing"}; NR>1 {for (i=4;i<=NF;i++) {if ($i!="--") SUM+=$i};print$0,$SUM; SUM=0}'$FILE


In that example i'm not calculating the mean but just the sum as a test, but the results are senseless. I also have to add a "Missing" column where i count the missing values of each row. What am i doing wrong? Thanks in advance!

awk • 1.1k views
1
Entering edit mode

Thank you all! The error was in the use of $as kloetzl stated ADD REPLY 4 Entering edit mode 3.5 years ago kloetzl ★ 1.1k Don't prefix any variables in awk with $ unless you want the value of the column.

BEGIN{
FS=";";
OFS=";";
print "Average;Missing"
}
NR>1{
sum=0;
c=0;
for(i=1;i<=NF;i++)
if($i != "--") { sum +=$i;
c++;
}
print sum/c, NF-c;
}

4
Entering edit mode
3.5 years ago

Solution already posted, but here is my own:

cat test
a;b;c;d;e;f;g;h
9;8;7;0;--;--;9
5;5;345;3;6;65;8
--;677;888;66;77;--;4
1;2;4;6;8;--;0;6

awk 'BEGIN {FS=";";OFS=";";SUM=0;MSS=0} {
if (NR==1) {
print "Sum\tAverage\tMissing"
} else {
for (i=1;i<=NF;i++) {
if ($i!="--") {SUM+=$(i)} else {MSS+=1}
} print SUM"\t"SUM/(NF-MSS)"\t"MSS; SUM=0; MSS=0}}' test

Sum     Average Missing
33      6.6     2
437     62.4286 0
1712    342.4   2
27      3.85714 1

3
Entering edit mode
3.5 years ago
Lisa Ha ▴ 120

Try this:

awk 'BEGIN {FS=OFS=";"};NR==1{print $0,"sum","miss"}; NR>1{sum=miss=0;for (i=1;i<=NF;i++) if($i=="NA"){miss+=1} else if ($i>0){sum+=$i} ; print \$0,sum,miss;}' file


Input file:

var1;var2;var3
1;2;3
4;NA;6
7;8;NA


Output:

var1;var2;var3;sum;miss
1;2;3;6;0
4;NA;6;10;1
7;8;NA;15;1