# Length of a group within a group (apply groupby after a groupby)

## 01 August 2019 - 1 answer

I am facing the following problem: I have groups (by `ID`), and for each of those groups I need to apply this rule: if the distances between locations within a group are within 3 meters, they should be merged together, so a new (distance) group is created (the code that creates such a group is shown below). What I want is the number of detections within each distance group, i.e. the length of the group.
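The grouping rule can be sketched in plain Python on one ID's sorted locations: start a new group whenever the gap to the previous location exceeds 3 meters, otherwise extend the current group. This is only an illustration of the intended behaviour (the function name `count_detections` is made up here, not from the question):

```python
def count_detections(locations, cutoff=3.0):
    """Group sorted locations whenever consecutive points are within
    `cutoff` meters of each other, and return the size of each group."""
    counts = []
    for i, loc in enumerate(locations):
        # Start a new group at the first point or after a gap > cutoff.
        if i == 0 or loc - locations[i - 1] > cutoff:
            counts.append(1)
        else:
            counts[-1] += 1
    return counts

# Locations of ID 1 from the example dataframe below.
print(count_detections([12.0, 14.0, 15.0, 17.5, 25.0, 30.0, 31.0, 34.0, 36.0, 37.0]))
# → [4, 1, 5]
```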

This all worked on its own, but applying it per `ID` group raises an error.

The code is as follows:

```
def group_nearby_peaks(df, col, cutoff=-3.00):
    """
    This function groups nearby peaks based on location.
    When peaks are within 3 meters from each other they will be added together.
    """
    min_location_between_groups = cutoff

    df = df.sort_values('Location')

    return (
        df.assign(
            location_diff=lambda d: d['Location'].diff(-1).fillna(-9999),
            NOD=lambda d: d[col]
            .groupby(d["location_diff"].shift().lt(min_location_between_groups).cumsum())
            .transform(len),
        )
    )
```
```
def find_relative_difference(df, peak_col, difference_col):

    def relative_differences_per_ID(ID_df):
        return (
            spoortak_df.pipe(find_difference_peaks)
            .loc[lambda d: d[peak_col]]
            .pipe(group_nearby_peaks, difference_col)
        )

    return df.groupby('ID').apply(relative_differences_per_ID)
```

The error I get is the following:

```
ValueError: No objects to concatenate
```
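For context, pandas raises this exact error when `pd.concat` is handed an empty sequence of objects; `groupby().apply()` can run into it when, for example, a filtering step inside the applied function leaves every group empty. A minimal reproduction (not the asker's data):

```python
import pandas as pd

# pd.concat raises this exact error when handed no objects at all,
# which is what groupby.apply runs into when every group comes back empty.
try:
    pd.concat([])
except ValueError as err:
    message = str(err)

print(message)  # No objects to concatenate
```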

Given the following example dataframe:

```
    ID  Location
0   1   12.0
1   1   14.0
2   1   15.0
3   1   17.5
4   1   25.0
5   1   30.0
6   1   31.0
7   1   34.0
8   1   36.0
9   1   37.0
10  2   8.0
11  2   14.0
12  2   15.0
13  2   17.5
14  2   50.0
15  2   55.0
16  2   58.0
17  2   59.0
18  2   60.0
19  2   70.0
```

Expected result:

```
    ID  Number of detections
0   1   4
1   1   1
2   1   5
3   2   1
4   2   3
5   2   1
6   2   5
```

Create a group ID `s` for `Location` values within 3 meters of each other. Locations more than 3 meters from the previous one start a new group ID, while nearby locations share the same ID. Finally, group by `ID` and `s` and `count`:

```
s = df.groupby('ID').Location.diff().fillna(0).abs().gt(3).cumsum()
df.groupby(['ID', s]).ID.count().reset_index(name='Number of detections').drop(columns='Location')
```

Out:

```
   ID  Number of detections
0   1                     4
1   1                     1
2   1                     5
3   2                     1
4   2                     3
5   2                     1
6   2                     4
7   2                     1
```
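Put together as a self-contained, runnable version (building the question's example frame inline; `drop(columns='Location')` is used here because the positional `drop('Location', 1)` form was removed in pandas 2.0):

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1] * 10 + [2] * 10,
    'Location': [12.0, 14.0, 15.0, 17.5, 25.0, 30.0, 31.0, 34.0, 36.0, 37.0,
                 8.0, 14.0, 15.0, 17.5, 50.0, 55.0, 58.0, 59.0, 60.0, 70.0],
})

# New group whenever the gap to the previous location (within an ID) exceeds 3 m.
s = df.groupby('ID').Location.diff().fillna(0).abs().gt(3).cumsum()

out = (df.groupby(['ID', s]).ID.count()
         .reset_index(name='Number of detections')
         .drop(columns='Location'))
print(out)
```

Note that the last two rows differ from the asker's expected output: 58, 59 and 60 each lie within 3 m of the *previous* point in 55–60, so with a strictly-greater-than-3 cutoff, 55 joins that chain and 70 starts its own group.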