How To Increment Stage By Condition In Python Pandas With Groupby

- 1 answer

I have data coming from two groups, A and B. The task is to monitor change: whenever the change (Leap) is greater than 4, the Stage is raised by 1. The data is ordered (time series).

import pandas as pd

df = pd.DataFrame({'Group': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B'],
                    'Leap': [1, 5, 1, 1, 5, 1, 1, 3, 5, 5, 1, 1]})

# First set Stage to 1 for all:
df['Stage'] = 1

# Function to find first leap -> set Stage to two.
def setStage2(df):
    df.loc[df['Leap'] > 4, 'Stage'] = 2
    return df

# Apply function by group:
df.groupby('Group').apply(setStage2)

[Image: output of the first trial]

This is how far I could get. The Stage should be incremental: once a group reaches Stage 2, it never goes back to 1. This is what the result should look like:

[Image: expected Stage values]

So how do I fill Stage?

Answer

Here is one solution combining groupby and transform.

import pandas as pd

df = pd.DataFrame({
    'Group': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B'],
    'Leap': [1, 5, 1, 1, 5, 1, 1, 3, 5, 5, 1, 1]
})

df["Stage"] = df.groupby("Group").Leap.transform(lambda x: (x > 4).cumsum()) + 1
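To see why this works, here is a minimal sketch that splits the same idea into separate steps. The intermediate names is_leap and leaps_so_far are illustrative only, and this version groups the boolean Series directly instead of using transform with a lambda. It should give the same result.

```python
import pandas as pd

df = pd.DataFrame({
    'Group': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B'],
    'Leap': [1, 5, 1, 1, 5, 1, 1, 3, 5, 5, 1, 1],
})

# True wherever the leap exceeds the threshold of 4.
is_leap = df['Leap'] > 4

# Running count of leaps seen so far, restarted for each group.
leaps_so_far = is_leap.astype(int).groupby(df['Group']).cumsum()

# Every group starts at Stage 1 and moves up by one per leap.
df['Stage'] = leaps_so_far + 1

print(df['Stage'].tolist())
# [1, 2, 2, 2, 3, 3, 1, 1, 2, 3, 3, 3]
```

Because the running count never decreases, the Stage cannot fall back once it has been raised. Note that two consecutive leaps (as in group B) raise the Stage twice.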

You can also use apply instead of transform in this case.
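A possible sketch of the apply variant follows. One assumption to flag: in recent pandas versions (2.0 and later, as far as I know), apply may prepend the group keys to the index of the result, which would break the alignment on assignment. Passing group_keys=False to groupby should keep the original row index.

```python
import pandas as pd

df = pd.DataFrame({
    'Group': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B'],
    'Leap': [1, 5, 1, 1, 5, 1, 1, 3, 5, 5, 1, 1],
})

# group_keys=False keeps the original row index on the result,
# so it aligns with df when assigned back as a column.
stage = (
    df.groupby('Group', group_keys=False)['Leap']
      .apply(lambda x: (x > 4).cumsum())
    + 1
)

df['Stage'] = stage
print(df)
```

transform is arguably the safer choice here, since it is meant to return a result with the same shape and index as the input.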

source: stackoverflow.com