# Perform Calculations Based On Signals In Array

## 16 August 2019 - 1 answer

I have two columns in an array: a 'close' column and a 'signals' column. I would like to perform calculations on the 'close' data based on the classified data in the 'signals' column. If the same signal appears consecutively (ignoring NaNs), nothing should happen; a calculation should be performed only when the signal at index n+t is the opposite of the preceding signal at index n.

This is for rudimentary back-testing code to test an algorithm I have come up with. I understand that a for-loop is likely needed, but I am not sure how to apply it correctly to specific index points of the data.

PSEUDOCODE

``````
for n in signals:
    if signals[n] == 1:
        # while signals[n+t] == 1, keep 'close' from index n
        # when signals[n+t] == 2:
        calculations[n+t] = close[n+t] - close[n]
``````

Here is the output I would like to produce programmatically.

``````
   close  signals  calculations
0    100      NaN           NaN
1    105        1           NaN
2    110      NaN           NaN
3    107        1           NaN
4    115      NaN           NaN
5    120        2            15
``````

Thanks for any help and please let me know if any clarification is needed!

One way might be:

1. Extract the rows where "signals" is not null using `dropna`
2. Remove consecutive duplicates by comparing each signal with the previous one via `shift`
3. Set the output column with `np.where()`: if the signal is 2, use the difference between `close` and the previous `close`; otherwise use `NaN`
4. Add this column to the input dataframe using `join`
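Step 2 is the least obvious one, so here is a minimal sketch of the `shift` comparison in isolation. The toy series below is made up for illustration and is not the data from the question.

```python
import pandas as pd

# Toy signal series (illustrative values only, no NaNs left at this stage)
signals = pd.Series([1, 1, 2, 2, 1], name="signals")

# shift() moves every value down one row, so each row is compared
# with its predecessor; the first row is compared with NaN and is kept
is_new_run = signals.shift() != signals

# Keep only the first row of each run of identical values
deduped = signals[is_new_run]
print(deduped.tolist())  # [1, 2, 1]
```

The surviving rows keep their original index labels (0, 2 and 4 here), which is what lets step 4 align the result back onto the full dataframe.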

Here is the full code:

``````
# Import modules
import pandas as pd
import numpy as np

# Build dataset (same values as in the question)
data = [[100, np.nan],
        [105, 1],
        [110, np.nan],
        [107, 1],
        [115, np.nan],
        [120, 2]]
df = pd.DataFrame(data, columns=["close", "signals"])

# Select rows where "signals" is not null
sub_df = df.dropna(subset=['signals'])

# Remove consecutive duplicates
# (.copy() avoids a SettingWithCopyWarning when adding the column below)
sub_df = sub_df.loc[sub_df.signals.shift() != sub_df.signals].copy()

# If signal == 2, set diff between close and previous close
# Else: set NaN
sub_df['output'] = np.where(sub_df.signals == 2, sub_df.close - sub_df.close.shift(), np.nan)
print(sub_df)
#    close  signals  output
# 1    105      1.0     NaN
# 5    120      2.0    15.0

# Add the new column to the original dataframe (aligned on the index)
print(df.join(sub_df['output']))
#    close  signals  output
# 0    100      NaN     NaN
# 1    105      1.0     NaN
# 2    110      NaN     NaN
# 3    107      1.0     NaN
# 4    115      NaN     NaN
# 5    120      2.0    15.0
``````
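Since the question mentions a for-loop, here is a rough loop-based sketch of the same logic for comparison. It is only one possible reading of the pseudocode: the names `entry_close` and `last_signal` are made up for this example, and it assumes, like the vectorized version, that a value is written only when the signal changes to 2.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "close": [100, 105, 110, 107, 115, 120],
    "signals": [np.nan, 1, np.nan, 1, np.nan, 2],
})

calculations = [np.nan] * len(df)
entry_close = None  # hypothetical name: close at the first row of the current run
last_signal = None  # hypothetical name: most recent non-NaN signal

for i, (close, signal) in enumerate(zip(df["close"], df["signals"])):
    if pd.isna(signal):
        continue  # rows without a signal are ignored
    if signal == last_signal:
        continue  # repeated signal: keep the close from the start of the run
    if signal == 2 and entry_close is not None:
        # signal flipped to 2: difference from the close where the run of 1s began
        calculations[i] = close - entry_close
    entry_close = close
    last_signal = signal

df["calculations"] = calculations
print(df)
```

On the sample data this fills in 15 at index 5 and leaves the other rows as `NaN`, matching the `output` column above. The vectorized version is likely to be faster on large arrays, but the loop may be easier to extend if the rule for each signal gets more involved.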