
Get Values By Condition From Different Column And Index


Given a df that represents user events:

index   id  action_id   feature session_id  n_page duration
1       1    null       null    1_1         1      1
2       1    3          a       1_1         2      1
3       1    null               1_1         3      1
4       1    null       pay     1_1         4      1
5       1    24                 1_1         5      1
6       1    107                1_1         6      2
7       2    null               2_1         1      1
8       2    107        c       2_1         2      1
9       2    null               2_1         3      1
10      2    34         pay     2_1         4      1

I need to group by session_id and, only for sessions that contain action_id == 34 or 24, get the last value of the feature column among the rows where action_id == 3 or 107, the n_page value of the row where action_id == 34 or 24, and the sum of duration over the session.

Output df:

session_id  n_page  feature sum_duration
1_1         5       a       7
2_1         4       c       4

Answer

df_group = df.groupby("session_id")["duration"].sum().reset_index(name="sum_duration")

df_dup = df[(df["action_id"] == 3) | (df["action_id"] == 107)][["session_id", "n_page", "feature"]]

df_dup.merge(df_group, on="session_id", how="inner")

We can change the join condition depending on the desired output. If this does not produce the desired output, please share the code used to create the input data.
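The merge above takes n_page from the same rows it filters the feature on, whereas the expected output in the question takes n_page from the rows where action_id is 24 or 34. One possible way to get that output is sketched below. The input frame is rebuilt by hand from the table in the question, and it assumes that both "null" and the empty feature cells mean a missing value; the variable names (n_page, feature, sum_duration, result) are only illustrative.

```python
import pandas as pd

# Input rebuilt from the question's table. Assumption: "null" and blank
# feature cells are both missing values.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 1, 1, 2, 2, 2, 2],
    "action_id": [None, 3, None, None, 24, 107, None, 107, None, 34],
    "feature": [None, "a", None, "pay", None, None, None, "c", None, "pay"],
    "session_id": ["1_1"] * 6 + ["2_1"] * 4,
    "n_page": [1, 2, 3, 4, 5, 6, 1, 2, 3, 4],
    "duration": [1, 1, 1, 1, 1, 2, 1, 1, 1, 1],
})

# Total duration per session.
sum_duration = (
    df.groupby("session_id")["duration"].sum().reset_index(name="sum_duration")
)

# n_page of the last row per session where action_id is 24 or 34.
# Sessions without such a row do not appear here, so the inner merges
# below drop them.
n_page = (
    df[df["action_id"].isin([24, 34])]
    .groupby("session_id")["n_page"]
    .last()
    .reset_index()
)

# Last non-missing feature per session among rows where action_id is 3 or 107.
feature = (
    df[df["action_id"].isin([3, 107])]
    .dropna(subset=["feature"])
    .groupby("session_id")["feature"]
    .last()
    .reset_index()
)

result = (
    n_page
    .merge(feature, on="session_id", how="inner")
    .merge(sum_duration, on="session_id", how="inner")
)
print(result)
```

With how="inner", a session that has a 24/34 row but no non-missing feature on a 3/107 row is dropped as well; switching the second merge to how="left" would keep it with a missing feature instead.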

source: stackoverflow.com