How To Add A Random Value To Many Rows In A Pandas Dataframe Iteratively?
Suppose I have a Pandas DataFrame named df, which has the following structure:
         Column 1  Column 2  ...  Column 104
Row 1        0.01      0.55  ...           3
Row 2        0.03      0.14  ...           1
...
Row 100      0.75      0.56  ...           0
What I am trying to accomplish is this: for every row that matches the condition given below, I need to generate 100 more rows, each with a random value between 0 and 0.05 added to it:
is_less = df.iloc[:,-1] > 1
df_try = df[is_less]
df = df.append([df_try]*100,ignore_index=True)
The problem is that I can simply duplicate the rows in df_try to generate 100 more rows for each case, but I also want to add a random value to each row, so that each row is different from the others but very similar.
import random
df = df.append([df_try + random.uniform(0,0.05)]*100, ignore_index=True)
This simply adds the same fixed random value to all 100 new rows of df_try, not a unique random value to each row. I know that this is because the above syntax does not iterate over df_try: random.uniform(0, 0.05) is evaluated only once, and the list multiplication then repeats that single result 100 times. Is there a suitable way to add the random values iteratively over the DataFrame in this case?
Answer
One idea is to create a 2D array with the same shape as the newly appended rows and add it to the repeated copies of df_try joined with concat:
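import numpy as np
import pandas as pd

N = 10
is_less = df.iloc[:, -1] > 1
df_try = df[is_less]
# one independent random value per cell of the N repeated copies of df_try
arr = np.random.uniform(0, 0.05, size=(N * len(df_try), len(df.columns)))
# DataFrame.append was removed in pandas 2.0, so pd.concat is used instead
df = pd.concat([df, pd.concat([df_try] * N) + arr], ignore_index=True)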
print (df)
Column 1 Column 2 Column 104
0 0.010000 0.550000 3.000000
1 0.030000 0.140000 1.000000
2 0.750000 0.560000 0.000000
3 0.024738 0.561647 3.045146
4 0.035315 0.584161 3.008656
5 0.022386 0.563025 3.033091
6 0.039175 0.588785 3.004649
7 0.049465 0.594903 3.003303
8 0.027366 0.580478 3.041745
9 0.044721 0.599853 3.001736
10 0.052849 0.589775 3.042434
11 0.033957 0.582610 3.045215
12 0.044349 0.582218 3.027665
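The idea above can be sketched as a small self-contained function that also handles the case where several rows match the condition. This is only a sketch: the helper name add_jittered_copies, its parameters, the sample data, and the seed are assumptions made for illustration and do not come from the question.

```python
import numpy as np
import pandas as pd


def add_jittered_copies(df, mask, n_copies, high=0.05, seed=None):
    """Append n_copies noisy copies of the rows selected by mask (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    selected = df[mask]
    # Repeat the selected block n_copies times: row order is block after block.
    repeated = pd.concat([selected] * n_copies, ignore_index=True)
    # Independent noise in [0, high) for every cell of the repeated block.
    noise = rng.uniform(0, high, size=repeated.shape)
    return pd.concat([df, repeated + noise], ignore_index=True)


# Made-up sample data in which two rows satisfy the condition.
df = pd.DataFrame(
    {
        "Column 1": [0.01, 0.03, 0.75, 0.20],
        "Column 2": [0.55, 0.14, 0.56, 0.40],
        "Column 104": [3, 1, 0, 2],
    }
)
mask = df.iloc[:, -1] > 1
out = add_jittered_copies(df, mask, n_copies=10, seed=0)
print(out.shape)  # (24, 3): 4 original rows + 2 matching rows * 10 copies
```

Sizing the noise array from repeated.shape keeps it in step with however many rows match, so the addition does not depend on df_try having exactly one row.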
If you need to add a single scalar to each copy of df_try, change your solution to use a list comprehension, so that random.uniform is called once per copy:
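import random
import pandas as pd

N = 10
is_less = df.iloc[:, -1] > 1
df_try = df[is_less]
# DataFrame.append was removed in pandas 2.0, so pd.concat is used instead
df = pd.concat([df] + [df_try + random.uniform(0, 0.05) for _ in range(N)], ignore_index=True)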
print (df)
Column 1 Column 2 Column 104
0 0.010000 0.550000 3.000000
1 0.030000 0.140000 1.000000
2 0.750000 0.560000 0.000000
3 0.036756 0.576756 3.026756
4 0.039357 0.579357 3.029357
5 0.048746 0.588746 3.038746
6 0.040197 0.580197 3.030197
7 0.011045 0.551045 3.001045
8 0.013942 0.553942 3.003942
9 0.054658 0.594658 3.044658
10 0.025909 0.565909 3.015909
11 0.012093 0.552093 3.002093
12 0.058463 0.598463 3.048463
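A vectorized variant of the same "one scalar per new row" behaviour avoids the Python-level loop. It is a minimal sketch, assuming the three-row sample data shown in the output above; the names rng, repeated, offsets, and result, as well as the seed, are illustrative choices and not part of the original answer.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

df = pd.DataFrame(
    {
        "Column 1": [0.01, 0.03, 0.75],
        "Column 2": [0.55, 0.14, 0.56],
        "Column 104": [3, 1, 0],
    }
)
N = 10
df_try = df[df.iloc[:, -1] > 1]
repeated = pd.concat([df_try] * N, ignore_index=True)
# One random offset per new row; axis=0 applies offsets[i] to every column of row i.
offsets = rng.uniform(0, 0.05, size=len(repeated))
result = pd.concat([df, repeated.add(offsets, axis=0)], ignore_index=True)
print(result.shape)  # (13, 3)
```

Within each new row every column is shifted by the same amount, which matches the pattern in the output above, while different rows receive different offsets.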