Pandas Capture Connected Rows

I have a dataframe that looks like this:

ID      Value
0x3000  nan
nan     1
nan     2
nan     3
0x4252  nan
nan     10
nan     12

Now I'm looking for a way to split this dataframe into its two groups, like so:

ID      Value
0x3000  nan
nan     1
nan     2
nan     3

and

ID      Value
0x4252  nan
nan     10
nan     12

So a group starts at a hex value and contains the connected rows that follow it, up to the next occurrence of a valid hex value.

How can this be done efficiently in pandas, without manually looping over the rows and collecting them one by one until the condition (a valid hex value) is met?

Answer

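You can use groupby with a custom grouping key to generate a list of DataFrames. df['ID'].notna() is True on every row that carries an ID, so its cumulative sum increases by one at each group start and stays constant over the rows that follow: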

l = [g for _,g in df.groupby(df['ID'].notna().cumsum())]

Output:

[       ID  Value
 0  0x3000    NaN
 1     NaN    1.0
 2     NaN    2.0
 3     NaN    3.0,
        ID  Value
 4  0x4252    NaN
 5     NaN   10.0
 6     NaN   12.0]
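
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example frame from the question and applies the same groupby. The names key, groups and by_id are my own for illustration and are not part of the original answer; the by_id dictionary is an optional extra that assumes every block begins with a row that has an ID.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "ID": ["0x3000", np.nan, np.nan, np.nan, "0x4252", np.nan, np.nan],
    "Value": [np.nan, 1, 2, 3, np.nan, 10, 12],
})

# True on rows that start a group; the running sum labels each block 1, 2, ...
key = df["ID"].notna().cumsum()

# One sub-DataFrame per block, in the original row order.
groups = [g for _, g in df.groupby(key)]

# Illustrative extra (an assumption, not in the answer above):
# look each block up by the ID found on its first row.
by_id = {g["ID"].iloc[0]: g for g in groups}
```

Note that if the frame had rows before the first ID, they would get the label 0 and form a block of their own, so you may want to drop that block before building by_id.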
source: stackoverflow.com