pyspark.pandas.groupby.GroupBy.var

GroupBy.var(ddof=1, numeric_only=False)

Compute variance of groups, excluding missing values.

New in version 3.3.0.

Parameters
ddof : int, default 1

Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements.

Changed in version 3.4.0: Supports arbitrary integers.

numeric_only : bool, default False

Include only float, int, and boolean columns.

New in version 4.0.0.
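
The divisor rule described for ddof can be illustrated with a minimal pure-Python sketch. This is not the pandas-on-Spark implementation; the helper name `group_variance` is hypothetical, and returning NaN when N - ddof is not positive is an assumption made for the illustration only.

```python
import math


def group_variance(values, ddof=1):
    """Variance of one group's values, dividing by N - ddof.

    Hypothetical helper for illustration; not part of pyspark.pandas.
    """
    # Exclude missing values (None or NaN) before counting N.
    present = [
        float(v)
        for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    n = len(present)
    divisor = n - ddof
    if divisor <= 0:
        # Assumption for this sketch: an undefined variance is reported as NaN.
        return float("nan")
    mean = sum(present) / n
    return sum((v - mean) ** 2 for v in present) / divisor


# Boolean values are treated as 0.0 and 1.0.
print(group_variance([True, False]))                # 0.5
print(group_variance([3, 3]))                       # 0.0
# ddof=0 divides by N instead of N - 1.
print(group_variance([1.0, 2.0, 3.0, 4.0], ddof=0)) # 1.25
```

With ddof=1 the divisor is N - 1 (the sample variance); with ddof=0 it is N (the population variance).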

Examples

>>> df = ps.DataFrame({"A": [1, 2, 1, 2], "B": [True, False, False, True],
...                    "C": [3, 4, 3, 4], "D": ["a", "b", "b", "a"]})
>>> df.groupby("A").var()
     B    C
A
1  0.5  0.0
2  0.5  0.0