
How To Assign Binary Values To Values In A CSV Column In Python?


I have a dataframe index and want to add a column dummy with ones and zeros depending on the value of the index. The data frame looks like:

        Date        index_value
0   0   8/1/2003    -0.33
1   1   9/1/2003    -0.37
2   2   10/1/2003   -0.42
3   3   11/1/2003    0.51
4   4   12/1/2003   -0.51
5   5   1/1/2004    -0.49
6   6   2/1/2004     0.68
7   7   3/1/2004    -0.58
8   8   4/1/2004    -0.57
9   9   5/1/2004    -0.47
10  10  6/1/2004    -0.67
11  11  7/1/2004    -0.59
12  12  8/1/2004     0.6
13  13  9/1/2004    -0.63
14  14  10/1/2004   -0.48
15  15  11/1/2004   -0.55
16  16  12/1/2004   -0.64
17  17  1/1/2005     0.68
18  18  2/1/2005    -0.81
19  19  3/1/2005    -0.68
20  20  4/1/2005    -0.48
21  21  5/1/2005    -0.48

and I want to create a dummy that gives 1 if the index value is greater than 0.5 and 0 otherwise. My code so far is:

df = pd.read_csv("index.csv", parse_dates=True)
df['dummy']=df['index_value']...

df = ....to_csv("indexdummy.csv")

But I have no idea how to assign a dummy variable. My expected output for the column dummy would be: 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0


Answer

Compare the column with 0.5 using Series.gt and cast the resulting boolean mask to integers:

df['dummy'] = df['index_value'].gt(.5).astype(int)
# Alternative (requires import numpy as np):
# df['dummy'] = np.where(df['index_value'].gt(.5), 1, 0)

# If you need to compare the DataFrame's index values instead:
# df['dummy'] = (df.index > .5).astype(int)
print(df)
            Date  index_value  dummy
0  0    8/1/2003        -0.33      0
1  1    9/1/2003        -0.37      0
2  2   10/1/2003        -0.42      0
3  3   11/1/2003         0.51      1
4  4   12/1/2003        -0.51      0
5  5    1/1/2004        -0.49      0
6  6    2/1/2004         0.68      1
7  7    3/1/2004        -0.58      0
8  8    4/1/2004        -0.57      0
9  9    5/1/2004        -0.47      0
10 10   6/1/2004        -0.67      0
11 11   7/1/2004        -0.59      0
12 12   8/1/2004         0.60      1
13 13   9/1/2004        -0.63      0
14 14  10/1/2004        -0.48      0
15 15  11/1/2004        -0.55      0
16 16  12/1/2004        -0.64      0
17 17   1/1/2005         0.68      1
18 18   2/1/2005        -0.81      0
19 19   3/1/2005        -0.68      0
20 20   4/1/2005        -0.48      0
21 21   5/1/2005        -0.48      0
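
For reference, here is a minimal end-to-end sketch of the same approach: read the CSV, build the dummy column, write it back out. The inline CSV text is a hypothetical stand-in for index.csv, using a few rows from the question and assuming the file has just the two columns Date and index_value. The in-memory buffers are only there so the snippet runs without any files on disk.

```python
import io

import pandas as pd

# Hypothetical stand-in for index.csv: a few rows taken from the question.
csv_text = """Date,index_value
8/1/2003,-0.33
11/1/2003,0.51
2/1/2004,0.68
8/1/2004,0.6
5/1/2005,-0.48
"""

df = pd.read_csv(io.StringIO(csv_text))

# gt(0.5) gives a boolean mask (strictly greater than 0.5);
# astype(int) maps True/False to 1/0.
df["dummy"] = df["index_value"].gt(0.5).astype(int)

# Written to an in-memory buffer here; with real files you would pass
# a path such as "indexdummy.csv" instead.
out = io.StringIO()
df.to_csv(out, index=False)
result = out.getvalue()
```

Passing index=False keeps pandas from writing the row labels as an extra unnamed column, which may be where the duplicated counter column in the question's frame comes from.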
source: stackoverflow.com