
Replacing The First Element Of Each Group By Its Aggregation Function


Suppose the following dataframe:

import pandas as pd

df = pd.DataFrame(
    {'X': ['a', 'a', 'b', 'a', 'b'],
     'Y': [2, 4, 8, 10, 5]})

which looks like this:

    X   Y
0   a   2
1   a   4
2   b   8
3   a   10
4   b   5

How can I replace the first element of each group (grouped by X) with that group's mean of Y?

The expected output:

    X   Y
0   a   5.33
1   a   4.00
2   b   6.50
3   a   10.00
4   b   5.00

Sorry if this question is too basic; I am new to Python and have only just started learning it.


Answer

Use GroupBy.transform to compute each group's mean, then use numpy.where with a mask from Series.duplicated so that only the first value in each group is replaced:

import numpy as np

# Keep Y where X has been seen before; otherwise take the group mean
df['Y'] = np.where(df.X.duplicated(), df.Y, df.groupby("X")['Y'].transform('mean'))
print(df)
   X          Y
0  a   5.333333
1  a   4.000000
2  b   6.500000
3  a  10.000000
4  b   5.000000
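
To see why this works, it may help to look at the two ingredients separately. The sketch below rebuilds the question's dataframe and prints the mask and the per-row group means; the variable names is_repeat and group_mean are only illustrative and do not appear in the answer above.

```python
import pandas as pd

df = pd.DataFrame(
    {'X': ['a', 'a', 'b', 'a', 'b'],
     'Y': [2, 4, 8, 10, 5]})

# True for every row whose X value already appeared in an earlier row,
# so the first row of each group is the only one marked False.
is_repeat = df['X'].duplicated()

# Group mean of Y, repeated for every row and aligned to the original index.
group_mean = df.groupby('X')['Y'].transform('mean')

print(is_repeat.tolist())            # [False, True, False, True, True]
print(group_mean.round(2).tolist())  # [5.33, 5.33, 6.5, 5.33, 6.5]
```

numpy.where then picks the original Y where is_repeat is True and the group mean where it is False.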
    

Another solution, using DataFrame.loc on the original df (not on the result of the previous snippet, which has already changed Y):

df.loc[~df.X.duplicated(), 'Y'] = df.groupby("X")['Y'].transform('mean')
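
As a rough sketch of how the loc approach could be wrapped up for reuse, the helper below works on a copy so the input dataframe is left untouched. The function name replace_first_with_mean and its key and value parameters are hypothetical, chosen here for illustration. It also casts the value column to float first, on the assumption that writing float means into an integer column may warn or fail depending on the pandas version.

```python
import pandas as pd


def replace_first_with_mean(df, key='X', value='Y'):
    """Return a copy of df whose first row per key group holds the group mean."""
    out = df.copy()
    # Assumption: cast up front so float means fit the column's dtype.
    out[value] = out[value].astype(float)
    # True only for the first occurrence of each key.
    first = ~out[key].duplicated()
    # Per-row group means, aligned to the original index.
    means = out.groupby(key)[value].transform('mean')
    out.loc[first, value] = means[first]
    return out


df = pd.DataFrame(
    {'X': ['a', 'a', 'b', 'a', 'b'],
     'Y': [2, 4, 8, 10, 5]})

result = replace_first_with_mean(df)
print(result)
```

Because the helper returns a new dataframe, it can be called repeatedly on the same input without the means drifting from one call to the next.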
source: stackoverflow.com