Difference between count and nunique formatting #32463

Closed
MathieuDutSik opened this issue Mar 5, 2020 · 1 comment
Labels
Output-Formatting __repr__ of pandas objects, to_string

@MathieuDutSik
With groupby, count gives the number of entries in each group, while nunique gives the number of unique entries in each group. The problem is that the nunique output contains an additional column:

>>> import pandas as pd
>>> df = pd.DataFrame({"A": [0,0,1,1,0], "B": [1,2,3,4,5]})
>>> df.groupby("A").count()
   B
A   
0  3
1  2
>>> df.groupby("A").nunique()
   A  B
A      
0  1  3
1  1  2

Why is the A column being added in nunique output?

jbrockmendel added the Output-Formatting label Oct 12, 2020

asishm commented Oct 12, 2020

Seems to be fixed on master.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({"A": [0,0,1,1,0], "B": [1,2,3,4,5]})

In [4]: df.groupby('A').count()
Out[4]: 
   B
A   
0  3
1  2

In [5]: df.groupby('A').nunique()
Out[5]: 
   B
A   
0  3
1  2

In [6]: pd.__version__
Out[6]: '1.2.0.dev0+652.gec8c1c4ec'
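
For anyone still on a release that shows the extra A column, one possible workaround is to select the value columns explicitly before aggregating. This is a minimal sketch and not something proposed in the thread; the variable names `counts` and `uniques` are illustrative only. It assumes that explicit column selection keeps the grouping key out of the result in both old and new versions.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 0, 1, 1, 0], "B": [1, 2, 3, 4, 5]})

# Selecting the value columns explicitly (note the double brackets, which
# keep the result a DataFrame) leaves the grouping key "A" as the index only.
counts = df.groupby("A")[["B"]].count()
uniques = df.groupby("A")[["B"]].nunique()

# Both results should now have the same column layout.
print(list(counts.columns))
print(list(uniques.columns))
print(uniques)
```

Because every value of B is distinct in this example, the two results hold the same numbers; with repeated values in B, nunique would report smaller counts than count while keeping the same shape.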
