pd.DataFrame.agg(np.var) vs np.var on a pd.Series
I am applying np.var() to the same dataset in two ways, but they give two different results. I do not think it is the n vs. n-1 issue, since it is the same NumPy function applied to the same data (a pandas Series of SAT Math scores).
These are the two ways:
- Directly on a Series
- On a filtered DataFrame, through the DataFrame.agg() method
I have read elsewhere that the difference could come from how the variance is calculated, and I am hoping for some confirmation or clarification. I am puzzled because I am using the same function, np.var(), in both cases:
Directly on the Series:
- Variance: 7068.194540561321
- Std. deviation: 84.07255521608297
With the filtered DataFrame and agg():
- Variance: 7209.558431
- Std. deviation: 84.909119
Based on the source code, this seems like a bug.
pd.Series.agg receives a function object and looks it up in its predefined table of Cython functions:
    # pandas.core.base line:555
    f = self._is_cython_func(arg)

    # pandas.core.base line:639
    def _is_cython_func(self, arg):
        """
        if we define an internal function for this argument, return it
        """
        return self._cython_table.get(arg)

    pd.Series._cython_table
    OrderedDict([(<function sum(iterable, start=0, /)>, 'sum'),
                 ...
                 (<function numpy.var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>)>, 'var'),
So f, the result of self._is_cython_func(arg), is the string 'var'.
This name then gets used here:

    # pandas.core.base line 556
    if f and not args and not kwargs:
        return getattr(self, f)(), None
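The lookup-and-swap step quoted above can be imitated with a small toy. This is only a sketch of the idea, not pandas' actual code: `TOY_TABLE` and `toy_agg` are hypothetical names, and the table holds just a few entries.

```python
import numpy as np
import pandas as pd

# Hypothetical, heavily reduced stand-in for pandas' internal table.
TOY_TABLE = {sum: "sum", np.var: "var", np.std: "std"}

def toy_agg(series, func, *args, **kwargs):
    """Mimic the shortcut: swap a known function for the Series method."""
    name = TOY_TABLE.get(func)
    if name and not args and not kwargs:
        # The Series method runs with its own defaults (ddof=1 for var).
        return getattr(series, name)()
    # Otherwise call the function itself on the raw values.
    return func(series.to_numpy(), *args, **kwargs)

s = pd.Series([1.0, 2.0, 3.0, 4.0])
shortcut = toy_agg(s, np.var)          # routed to s.var(), so ddof=1
direct = toy_agg(s, np.var, ddof=0)    # np.var itself, with ddof=0
print(shortcut, direct)
```

In this toy, the function that was passed in is never called on the shortcut path, so its own default arguments never come into play.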
    getattr(pd.Series, 'var')
    <function pandas.core.series.Series.var(self, axis=None, skipna=None, level=None, ddof=1, numeric_only=None, **kwargs)>
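And there is the culprit!
ddof is now 1: np.var defaults to ddof=0, but Series.var defaults to ddof=1, so agg(np.var) ends up computing the sample variance (dividing by n-1) instead of the population variance (dividing by n).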
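A minimal sketch of the effect, and of one way to sidestep it by stating ddof explicitly. The scores below are made up, since the SAT data from the question is not shown. The last check uses the figures reported in the question: their ratio is 1.02 = 51/50, which would be consistent with 51 observations, though that sample size is an inference and not something the post states.

```python
import numpy as np
import pandas as pd

# Hypothetical scores; the SAT data from the question is not available here.
scores = pd.Series([480.0, 510.0, 525.0, 560.0, 610.0], name="math")
values = scores.to_numpy()
n = len(values)

population_var = np.var(values)   # NumPy default ddof=0: divide by n
sample_var = scores.var()         # pandas default ddof=1: divide by n - 1

# The two results differ exactly by the factor n / (n - 1).
assert np.isclose(sample_var, population_var * n / (n - 1))

# Stating ddof explicitly makes both libraries agree.
assert np.isclose(scores.var(ddof=0), population_var)
assert np.isclose(np.var(values, ddof=1), sample_var)

# The figures reported in the question show the same pattern.
reported_ratio = 7209.558431 / 7068.194540561321
assert np.isclose(reported_ratio, 51 / 50)
```

Calling the pandas method with an explicit ddof, for example scores.var(ddof=0) or df.var(ddof=0) on a DataFrame, avoids relying on either library's default.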