Ad
Split Dataframe By Certain Condition But Keep The Original Dataframe
I have a dataframe "bb" like this:
Response Unique Count
I love it so much! 246_0 1
This is not bad, but can be better. 246_1 2
Well done, let's do it. 247_0 1
If count is lager than 1, I would like to split the string and make the dataframe "bb" become this: (result I expected)
Response Unique
I love it so much! 246_0
This is not bad 246_1_0
but can be better. 246_1_1
Well done, let's do it. 247_0
My code:
bb = DataFrame(bb[bb['Count'] > 1].Response.str.split(',').tolist(), index=bb[bb['Count'] > 1].Unique).stack()
bb = bb.reset_index()[[0, 'Unique']]
bb.columns = ['Response','Unique']
bb=bb.replace('', np.nan)
bb=bb.dropna()
print(bb)
But the result is like this:
Response Unique
0 This is not bad 246_1
1 but can be better. 246_1
How can I keep the original dataframe in this case?
Ad
Answer
First split only values per condition with to new helper Series
and then add counter values by GroupBy.cumcount
only per duplicated index values by Index.duplicated
:
s = df.loc[df.pop('Count') > 1, 'Response'].str.split(',', expand=True).stack()
df1 = df.join(s.reset_index(drop=True, level=1).rename('Response1'))
df1['Response'] = df1.pop('Response1').fillna(df1['Response'])
mask = df1.index.duplicated(keep=False)
df1.loc[mask, 'Unique'] += df1[mask].groupby(level=0).cumcount().astype(str).radd('_')
df1 = df1.reset_index(drop=True)
print (df1)
Response Unique
0 I love it so much! 246_0
1 This is not bad 246_1_0
2 but can be better. 246_1_1
3 Well done! 247_0
EDIT: If need _0
for all another values remove mask:
s = df.loc[df.pop('Count') > 1, 'Response'].str.split(',', expand=True).stack()
df1 = df.join(s.reset_index(drop=True, level=1).rename('Response1'))
df1['Response'] = df1.pop('Response1').fillna(df1['Response'])
df1['Unique'] += df1.groupby(level=0).cumcount().astype(str).radd('_')
df1 = df1.reset_index(drop=True)
print (df1)
Response Unique
0 I love it so much! 246_0_0
1 This is not bad 246_1_0
2 but can be better. 246_1_1
3 Well done! 247_0_0
Ad
source: stackoverflow.com
Related Questions
- → What are the pluses/minuses of different ways to configure GPIOs on the Beaglebone Black?
- → Django, code inside <script> tag doesn't work in a template
- → React - Django webpack config with dynamic 'output'
- → GAE Python app - Does URL matter for SEO?
- → Put a Rendered Django Template in Json along with some other items
- → session disappears when request is sent from fetch
- → Python Shopify API output formatted datetime string in django template
- → Can't turn off Javascript using Selenium
- → WebDriver click() vs JavaScript click()
- → Shopify app: adding a new shipping address via webhook
- → Shopify + Python library: how to create new shipping address
- → shopify python api: how do add new assets to published theme?
- → Access 'HTTP_X_SHOPIFY_SHOP_API_CALL_LIMIT' with Python Shopify Module
Ad