I am creating a groupby object from a Pandas DataFrame and want to select all the groups with size greater than 1.

Example:

         A  B
    0  foo  0
    1  bar  1
    2  foo  2
    3  foo  3

The following doesn't seem to work:

    grouped = df.groupby('A')
    grouped[grouped.size > 1]

Expected Result:

    A
    foo  0
         2
         3

Mykola Zotko

asked Oct 31, 2012 at 21:03

  • give us a concrete example, and show what you have tried.

    Oct 31, 2012 at 21:08

  • Hopefully some help: grouped.size().apply(lambda x: x>1), but I'm not sure how to do this

    Oct 31, 2012 at 21:44

  • this is interesting: for a change I have hit an area where a feature I need is missing in Pandas. For a long time it was my understanding of it that was missing. Great library for what I do.

    Oct 31, 2012 at 21:51
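
One way to finish the `size()` idea from the comment above, as a minimal sketch rather than a definitive recipe: compute each group's size, keep the labels whose size exceeds 1, and select the matching rows with `isin`. The frame is rebuilt from the question's example; the names `sizes` and `big_labels` are illustrative only.

```python
import pandas as pd

# Example frame rebuilt from the question.
df = pd.DataFrame({"A": ["foo", "bar", "foo", "foo"], "B": [0, 1, 2, 3]})

# Series of group sizes, indexed by the distinct values of column A.
sizes = df.groupby("A").size()

# Labels of the groups that have more than one row.
big_labels = sizes[sizes > 1].index

# Keep only the rows whose A value is one of those labels.
result = df[df["A"].isin(big_labels)]
print(result)
```

This returns a plain DataFrame in the original row order, not a GroupBy object.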

4 Answers

I have found transform to be much more efficient than filter for very large dataframes:

    element_group_sizes = df['A'].groupby(df['A']).transform('size')
    df[element_group_sizes>1]

Or, in one line:

                  df[df['A'].groupby(df['A']).transform('size')>1]                                  
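
For reference, a self-contained sketch of the same approach on the question's example data (the frame is reconstructed here and is the only assumption). `transform('size')` returns one value per original row, namely the size of that row's group, so the comparison yields a boolean mask aligned with `df`:

```python
import pandas as pd

# Example frame rebuilt from the question.
df = pd.DataFrame({"A": ["foo", "bar", "foo", "foo"], "B": [0, 1, 2, 3]})

# One entry per row of df: the number of rows sharing that row's A value.
element_group_sizes = df["A"].groupby(df["A"]).transform("size")

# Boolean indexing keeps the rows whose group has more than one member.
result = df[element_group_sizes > 1]
print(result)
```

Because the mask is built without calling a Python function once per group, this tends to scale better than `filter` with a lambda, which is the efficiency point made above.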

answered Aug 2, 2018 at 16:47

  • Great answer! Only you paid attention to efficiency. Bravo!

    Apr 4, 2019 at 1:54

  • Shouldn't be element_group_sizes = df['A'].groupby('A')['A'].transform('size') instead?

    May 24, 2019 at 12:05

  • @IgorFobia no -- df['A'] will be a Series and will no longer have a column 'A'. I suppose element_group_sizes = df.groupby('A')['A'].transform('size') would work though.

    May 24, 2019 at 15:08
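
A quick check of the two spellings discussed in these comments, sketched on the question's example data (the names `by_series` and `by_column` are illustrative only). Both should yield the same per-row group sizes:

```python
import pandas as pd

# Example frame rebuilt from the question.
df = pd.DataFrame({"A": ["foo", "bar", "foo", "foo"], "B": [0, 1, 2, 3]})

# Group the Series by its own values (the form used in the answer).
by_series = df["A"].groupby(df["A"]).transform("size")

# Group the DataFrame by column A, then select column A (the form from the comment).
by_column = df.groupby("A")["A"].transform("size")

# Element-wise comparison of the two results.
same = bool((by_series == by_column).all())
print(same)
```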

As of pandas 0.12 you can do:

    >>> grouped.filter(lambda x: len(x) > 1)
         A  B
    0  foo  0
    2  foo  2
    3  foo  3
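
A self-contained version of the snippet above, as a sketch that assumes the example frame from the question:

```python
import pandas as pd

# Example frame rebuilt from the question.
df = pd.DataFrame({"A": ["foo", "bar", "foo", "foo"], "B": [0, 1, 2, 3]})
grouped = df.groupby("A")

# filter calls the function once per group, passing that group's sub-DataFrame,
# and keeps every row of the groups for which the function returns True.
result = grouped.filter(lambda x: len(x) > 1)
print(result)
```

The result is a DataFrame with the surviving rows in their original order, not a GroupBy object.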

answered Aug 15, 2013 at 21:13

  • What is the 'x' in this instance? Does that refer to the column which you used to groupby?

    Oct 17, 2013 at 23:45

  • x would be each subgroup of the groupby operation, which you can examine with grouped.groups. In case of a multicolumn groupby these subgroups refer to several columns, but this is irrelevant as len counts the rows in pandas objects.

    Oct 18, 2013 at 8:45

  • Is there a way to get a GroupBy object after filter, not a DataFrame? The only way I see now is to call groupby again, but this seems inefficient

    Oct 27, 2015 at 15:56

  • @IvanVirabyan Worse, with categorical values the empty groups pop up again.

    Mar 27, 2018 at 18:32

  • grouped.filter(lambda x: len(x.index) > 1) should be slightly faster

    May 24, 2019 at 10:37
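
On the re-grouping and categorical comments above, a hedged sketch of two possible workarounds, assuming column A has been converted to a categorical dtype (the variable names are illustrative only). After `filter`, the unused category is still part of the dtype, which is why empty groups can reappear when grouping again:

```python
import pandas as pd

# Same example data, but with A stored as a categorical column.
df = pd.DataFrame({"A": pd.Categorical(["foo", "bar", "foo", "foo"]),
                   "B": [0, 1, 2, 3]})

# observed=True tells groupby to ignore categories that have no rows.
filtered = df.groupby("A", observed=True).filter(lambda x: len(x) > 1)

# The 'bar' category survives in the dtype even though no row uses it.
print(list(filtered["A"].cat.categories))

# Option 1: group again with observed=True so empty categories are skipped.
regrouped = filtered.groupby("A", observed=True)
print(regrouped.size())

# Option 2: drop the unused categories before grouping again.
trimmed = filtered.assign(A=filtered["A"].cat.remove_unused_categories())
print(list(trimmed["A"].cat.categories))
```

Either way `groupby` still has to be called a second time; this sketch only addresses the empty groups, not the cost of re-grouping.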

You can use the method filter and the property shape:

    df.groupby('A').filter(lambda x: x.shape[0] > 1)

answered Feb 13, 2021 at 18:07

If you still need a workaround:

    In [49]: pd.concat([group for _, group in grouped if len(group) > 1])
    Out[49]:
         A  B
    0  foo  0
    2  foo  2
    3  foo  3
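
A self-contained sketch of this workaround, assuming the example frame from the question. The guard for an empty list is an addition here, since `pd.concat` raises a `ValueError` when given nothing to concatenate:

```python
import pandas as pd

# Example frame rebuilt from the question.
df = pd.DataFrame({"A": ["foo", "bar", "foo", "foo"], "B": [0, 1, 2, 3]})
grouped = df.groupby("A")

# Iterating over a GroupBy yields (key, sub-DataFrame) pairs.
big_groups = [group for _, group in grouped if len(group) > 1]

# Fall back to an empty frame with the same columns if no group qualifies.
result = pd.concat(big_groups) if big_groups else df.iloc[0:0]
print(result)
```

With several qualifying groups the rows come back ordered group by group rather than in their original order; `result.sort_index()` restores the original order if that matters.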

answered Nov 1, 2012 at 17:00

  • Thank you: that's what I have implemented for now, but it would be nice to know how to do filtering on grouped objects, because that would avoid writing a new list comprehension for each custom filtering case.

    Nov 1, 2012 at 17:57

  • The issue #919 cited above would be a good solution once someone implements it

    Nov 9, 2012 at 20:59