Get Max Date For Unique Row Value

- 1 answer

Hello, I have data like this:

              campaign  status  d_cap
date                                 
2019-10-07  campaign_1   start    400
2019-10-13  campaign_2   start    400
2019-10-14  campaign_1  change   1000
2019-10-14  campaign_2  change    800
2019-11-10  campaign_1    stop      0
2019-11-12  campaign_2  change   2000

Required output:

              campaign  status  d_cap
date                                 
2019-11-10  campaign_1    stop      0
2019-11-12  campaign_2  change   2000

So I want to get the last status and d_cap for each unique campaign, based on its max date. I tried to solve this with a for loop, but I don't think that's the best solution.

Answer

If I understand correctly, you need:

# 'date' is the index in the question's data, so make it a regular column first
pdf = pdf.reset_index()
# Boolean mask: True where a row's date equals the max date of its campaign
idx = pdf.groupby("campaign")["date"].transform("max") == pdf["date"]
# Filter the df and restore 'date' as the index
final = pdf[idx].set_index("date")
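
For reference, here is a self-contained sketch that rebuilds the sample data from the question and gets the same result a different way: sort by date and keep the last row per campaign. The helper name `latest_per_campaign` is made up for illustration, and it assumes `date` is the index, as in the question. Unlike the boolean filter above, which keeps every row tied for the max date, this keeps exactly one row per campaign.

```python
import pandas as pd

# Sample data copied from the question; 'date' is the index there.
pdf = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2019-10-07", "2019-10-13", "2019-10-14",
             "2019-10-14", "2019-11-10", "2019-11-12"]
        ),
        "campaign": ["campaign_1", "campaign_2", "campaign_1",
                     "campaign_2", "campaign_1", "campaign_2"],
        "status": ["start", "start", "change", "change", "stop", "change"],
        "d_cap": [400, 400, 1000, 800, 0, 2000],
    }
).set_index("date")


def latest_per_campaign(df):
    """Return one row per campaign: the row with the latest date (hypothetical helper)."""
    flat = df.reset_index()  # turn the 'date' index into a column
    # After sorting by date, the last occurrence of each campaign is its latest row.
    latest = flat.sort_values("date").drop_duplicates("campaign", keep="last")
    return latest.set_index("date")


final = latest_per_campaign(pdf)
print(final)
```

If several rows can share the same max date within a campaign and you need all of them, the mask-based filter is the safer choice.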
source: stackoverflow.com