Ad
Shifting Timeseries Data Using Per Group Using Shift() And Groupby() Results In NaN
Given the following dataset
df = pd.DataFrame(
{
'YearMo': ['01', '02', '01', '02', '01', '02'],
'Prod': ['a', 'a', 'b', 'b', 'c', 'c'],
'Value': [1, 2, 3, 4, 5, 6]
}
)
I'm trying to lag data per year/prod group. It is my understanding this can be done using shift()
after a groupby()
, as follows:
df['shifteddata'] = df.groupby(['YearMo', 'Prod'])['Value'].shift()
The result of this however is a column containing NaN.
print(df)
YearMo Prod Value shifteddata
0 01 a 1 NaN
1 02 a 2 NaN
2 01 b 3 NaN
3 02 b 4 NaN
4 01 c 5 NaN
5 02 c 6 NaN
Whereas I am expecting
YearMo Prod Value shifteddata
0 01 a 1 NaN
1 02 a 2 1
2 01 b 3 NaN
3 02 b 4 3
4 01 c 5 NaN
5 02 c 6 5
What am I missing?
Ad
Answer
I guess you are missing the fact that groups defined by .groupby(['YearMo', 'Prod'])
are groups consisting of only one item. Shift of one item returns NaN
s.
Your desired output can be reached by the following code:
df['shifteddata'] = df.groupby(['Prod'])['Value'].shift()
Ad
source: stackoverflow.com
Related Questions
- → What are the pluses/minuses of different ways to configure GPIOs on the Beaglebone Black?
- → Django, code inside <script> tag doesn't work in a template
- → React - Django webpack config with dynamic 'output'
- → GAE Python app - Does URL matter for SEO?
- → Put a Rendered Django Template in Json along with some other items
- → session disappears when request is sent from fetch
- → Python Shopify API output formatted datetime string in django template
- → Can't turn off Javascript using Selenium
- → WebDriver click() vs JavaScript click()
- → Shopify app: adding a new shipping address via webhook
- → Shopify + Python library: how to create new shipping address
- → shopify python api: how do add new assets to published theme?
- → Access 'HTTP_X_SHOPIFY_SHOP_API_CALL_LIMIT' with Python Shopify Module
Ad