Is There A Way To Select All Values For A Certain Date Within Pandas With A Condition?


I have a DataFrame (df) whose columns are company names and whose entries are each company's cumulative sales volume up to a given day of a given year. The df is indexed by dates in datetime format. I would like to select the values for 31 December of every year; if there is no entry for that date, I want to select the closest previous date that has a value. The result should be a DataFrame with the companies and their entries for 31 December of each year, or the most recent earlier entry if that date is not available. For example, if there is no entry for Amazon on 12-31-2015 but there is one for 12-30-2015, I want to retrieve that entry. Currently I have the following code to retrieve the entries for 31 December of every year:

end_of_year_sales = df.loc[(df.index.month == 12) & (df.index.day == 31)]

This correctly retrieves all the company columns with their sales volume for 31 December, but I do not know how to retrieve the most recent available value when a company has no value on 31 December of a given year.

Answer

This will only work for December values, but here is one trick:

df.loc[df.index.to_series().resample('y').last().values]

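That is, resample the index annually, take the last date present in each year, and select those rows. It works regardless of which day the data for a year actually ends on. Note that this picks a single row per year for all companies, so a company whose value is missing on that row will still show NaN there.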
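If the last available value is needed per company rather than per row, one possible variation is to group by calendar year and take the last non-null value in each column. This is only a sketch, not part of the original answer: it assumes a missing entry is stored as NaN, and the company names and figures are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical cumulative sales data; names and numbers are illustrative only.
idx = pd.to_datetime(
    ["2015-12-29", "2015-12-30", "2015-12-31", "2016-12-30", "2016-12-31"]
)
df = pd.DataFrame(
    {
        # Amazon has no value on 2015-12-31, Google has none on 2016-12-31.
        "Amazon": [100.0, 110.0, np.nan, 200.0, 210.0],
        "Google": [50.0, 55.0, 60.0, 90.0, np.nan],
    },
    index=idx,
)

# Sort first so that "last" means "most recent date".
df_sorted = df.sort_index()

# GroupBy.last() returns the last non-null value of each column per group,
# so each company falls back to its own most recent entry within the year.
end_of_year_sales = df_sorted.groupby(df_sorted.index.year).last()
print(end_of_year_sales)
```

The result is indexed by year rather than by date, so the information about which day each value came from is lost. If a company has no values at all in a given year, its entry for that year stays NaN.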

source: stackoverflow.com