Question

Averaging multiple columns

4.8 years ago

vinayjrao ▴ 250

Hi,

I have a file with five columns. For rows that share the same value in the first column, I want the mean of each of the last four columns. For example, my file looks like this -

A 1 1 1 1
A 1 3 1 1
A 1 1 2 3
B 5 7 2 4
C 2 1 5 1
C 2 2 3 6

The desired output is -

A 1 1.7 1.3 1.7
B 5 7 2 4
C 2 1.5 4 3.5

Any help would be appreciated.

Thanks.

R shell • 1.1k views

updated 4.8 years ago by zx8754 11k • written 4.8 years ago by vinayjrao ▴ 250

4.8 years ago

zx8754 11k

Using R:

# example data
df1 <- read.table(text = "
A 1 1 1 1
A 1 3 1 1
A 1 1 2 3
B 5 7 2 4
C 2 1 5 1
C 2 2 3 6
", header = FALSE)

aggregate(df1[ -1 ], df1[ 1 ], FUN = mean)
#    V1 V2       V3       V4       V5
# 1  A  1 1.666667 1.333333 1.666667
# 2  B  5 7.000000 2.000000 4.000000
# 3  C  2 1.500000 4.000000 3.500000


Thank you for your reply. I am still working on it.

Will update once done.

4.8 years ago by vinayjrao ▴ 250

score 2 · Accepted Answer · 2019-06-26

4.8 years ago

GouthamAtla 12k

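Using Python (pandas):

import pandas as pd

# Read the tab-separated file; it has no header row, so columns are labelled 0-4
df = pd.read_csv("tmp.txt", sep="\t", header=None)
df
#    0  1  2  3  4
# 0  A  1  1  1  1
# 1  A  1  3  1  1
# 2  A  1  1  2  3
# 3  B  5  7  2  4
# 4  C  2  1  5  1
# 5  C  2  2  3  6

# Group rows by the first column and average the remaining columns
df.groupby(0).mean()
#      1         2         3         4
# 0
# A  1.0  1.666667  1.333333  1.666667
# B  5.0  7.000000  2.000000  4.000000
# C  2.0  1.500000  4.000000  3.500000

# Write the group means as tab-separated text without a header row
df.groupby(0).mean().to_csv("tmp_mean.txt", sep="\t", header=False)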
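
If pandas is not installed, the same group-wise mean can be sketched with only the Python standard library. This is a minimal illustration, not part of the answers above. It assumes whitespace-separated input, and the names `text`, `group_means` and `result` are made up for the example. The rows are the example data from this thread, embedded as a string so the snippet runs on its own. Values are rounded to one decimal only when printing, to match the desired output in the question.

```python
from collections import defaultdict
from statistics import mean

# Example rows from this thread (whitespace-separated, no header row)
text = """\
A 1 1 1 1
A 1 3 1 1
A 1 1 2 3
B 5 7 2 4
C 2 1 5 1
C 2 2 3 6
"""


def group_means(lines):
    """Return {key: [mean of each numeric column]}, keyed on the first field."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        groups[fields[0]].append([float(x) for x in fields[1:]])
    # zip(*rows) turns the rows of one group into columns
    return {key: [mean(col) for col in zip(*rows)] for key, rows in groups.items()}


result = group_means(text.splitlines())
for key, means in result.items():
    # round to one decimal; the "g" format drops a trailing ".0"
    print(key, *(f"{round(m, 1):g}" for m in means))
```

To run it on a real file, something like `group_means(open("tmp.txt"))` should work, since iterating over a file object yields its lines.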