# Intersection Of Multiple Rows In Single DataFrame

## 02 June 2019 - 1 answer

I have a temperature DataFrame with thousands of rows (time series data) and 40 columns (40 points in a catchment). The entries are zeros and ones (1 means an active part of the catchment, 0 means a non-active part). I want to store the number of intersected values in a separate column (named inter) in the same DataFrame.

I expect the output to look like this [attached image]

• The value in the first row of inter should be zero, as all entries are zero and no part is active on the first day.

• The value in the 2nd row of inter should be 4, as four parts are active on day 2.

• The value in the 3rd row of inter should be 3 (the number of intersected values of all rows above, including the 3rd row). Green boxes in the image show the value for the 3rd row.

• The value in the 4th row of inter should be the number of intersected values of all rows above (yellow shaded area in the image).

• Similarly, blue boxes show the value for the 5th row, red boxes show the value for the sixth row, and so on.

Note: for every row, I will count the intersection with all rows above it.

I deserve a reward for this :) Here is your answer:

```python
import pandas as pd
import numpy as np

# set up test data
data = {
    '0': [0, 0, 0, 1, 0], '1': [0, 0, 1, 0, 1], '2': [0, 0, 0, 1, 0],
    '3': [0, 0, 1, 1, 1], '4': [0, 1, 1, 1, 0], '5': [0, 0, 0, 0, 1],
    '6': [0, 1, 1, 1, 0], '7': [0, 0, 1, 0, 1], '8': [0, 1, 0, 1, 0],
    '9': [0, 1, 1, 0, 0], '10': [0, 0, 1, 0, 0], '11': [0, 0, 0, 1, 1],
    '12': [0, 0, 0, 1, 1],
}
data = pd.DataFrame(data=data)

# collect inter data
inter_data = []
for main_index, main_row in data.iterrows():

    # select all rows up to and including the current one (.loc is inclusive)
    selected_data = data.loc[0:main_index, :]

    # handle rows with no active parts (e.g. the first row)
    if 1 not in main_row.values:
        inter_data.append(0)
    else:
        # handle second row: count its own active parts
        # (shape is a (rows, columns) tuple, so compare the row count)
        if selected_data.shape[0] == 2:
            inter_data.append(selected_data[1:2].values.sum())
        # handle rest of data
        else:
            # drop last row from selected data
            selected_data = selected_data[:-1]
            # sum selected data column by column
            summed_data = 0
            for index, row in selected_data.iterrows():
                summed_data += row.values

            # get positions of 1 in the current row
            positions = np.where(main_row.values == 1)
            # get summed data at those positions
            positions_data = summed_data[positions]
            # count positions that were active in at least one earlier row
            inter_data.append((positions_data >= 1).sum())

# add inter data to raw data
data['inter'] = inter_data
```

Output:

```
   0  1  2  3  4  5  6  7  8  9  10  11  12  inter
0  0  0  0  0  0  0  0  0  0  0   0   0   0      0
1  0  0  0  0  1  0  1  0  1  1   0   0   0      4
2  0  1  0  1  1  0  1  1  0  1   1   0   0      3
3  1  0  1  1  1  0  1  0  1  0   0   1   1      4
4  0  1  0  1  0  1  0  1  0  0   0   1   1      5
```
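
With thousands of rows, the nested `iterrows` loops may become slow. As a rough sketch (not part of the original answer), the same rule can be expressed with NumPy's running logical OR. The function name `inter_counts` is made up for illustration, and the sketch assumes the second row is special-cased exactly as in the loop version, counting its own active parts.

```python
import numpy as np
import pandas as pd

def inter_counts(df):
    """Hypothetical helper: vectorized version of the loop-based answer."""
    active = df.to_numpy().astype(bool)
    # seen[i, j] is True if column j is active in row i or any row above it
    seen = np.logical_or.accumulate(active, axis=0)
    # shift down by one row so that only strictly earlier rows are considered
    seen_before = np.zeros_like(active)
    seen_before[1:] = seen[:-1]
    # count columns active now that were also active at some earlier time
    counts = (active & seen_before).sum(axis=1)
    # assumption carried over from the answer: row 2 counts its own active parts
    if len(df) > 1:
        counts[1] = active[1].sum()
    return pd.Series(counts, index=df.index, name="inter")

# same test data as above, written row by row
rows = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1],
]
data = pd.DataFrame(rows, columns=[str(i) for i in range(13)])
data["inter"] = inter_counts(data)
print(data["inter"].tolist())
```

For the five test rows this should give the same inter column as the loop version. The function must be called before the inter column is added, otherwise that column would be treated as a catchment point.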