How To Filter Pandas Dataframe
I am creating a groupby object from a Pandas DataFrame and want to select all the groups with a size greater than 1.
Example:
     A  B
0  foo  0
1  bar  1
2  foo  2
3  foo  3
The following doesn't seem to work:
grouped = df.groupby('A')
grouped[grouped.size > 1]
Expected Result:
A
foo  0
     2
     3
Mykola Zotko
asked Oct 31, 2012 at 21:03
Abhi
4 Answers
I have found transform to be much more efficient than filter for very large dataframes:
element_group_sizes = df['A'].groupby(df['A']).transform('size')
df[element_group_sizes>1]
Or, in one line:
df[df['A'].groupby(df['A']).transform('size')>1]
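For anyone who wants to run this end to end, here is a minimal sketch using the sample frame from the question. The variable names `group_sizes` and `result` are just illustrative choices; the grouping is written as `df.groupby('A')['A']`, which the comments below suggest should also work.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"A": ["foo", "bar", "foo", "foo"], "B": [0, 1, 2, 3]})

# transform('size') returns one value per row: the size of that row's group,
# aligned to the original index, so it can be used directly as a boolean mask
group_sizes = df.groupby("A")["A"].transform("size")

# Keep only the rows whose group has more than one member
result = df[group_sizes > 1]
print(result)
```

Because the mask is aligned to the original index, the surviving rows keep their original order and labels.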
answered Aug 2, 2018 at 16:47
Sealander
-
Great answer! Only you paid attention to efficiency. Bravo!
Apr 4, 2019 at 1:54
-
Shouldn't it be element_group_sizes = df['A'].groupby('A')['A'].transform('size') instead?
May 24, 2019 at 12:05
-
@IgorFobia no, df['A'] will be a Series and will no longer have a column 'A'. I suppose element_group_sizes = df.groupby('A')['A'].transform('size') would work though.
May 24, 2019 at 15:08
As of pandas 0.12 you can do:
>>> grouped.filter(lambda x: len(x) > 1)
     A  B
0  foo  0
2  foo  2
3  foo  3
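A self-contained version of the same call, as a sketch that assumes the sample frame from the question and pandas 0.12 or later:

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"A": ["foo", "bar", "foo", "foo"], "B": [0, 1, 2, 3]})
grouped = df.groupby("A")

# filter calls the function once per group and keeps every row
# of the groups for which it returns True
result = grouped.filter(lambda x: len(x) > 1)
print(result)
```

The result is a plain DataFrame with the original row labels, not a GroupBy object.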
answered Aug 15, 2013 at 21:13
elyase
-
What is the 'x' in this instance? Does that refer to the column which you used to groupby?
Oct 17, 2013 at 23:45
-
x would be each subgroup of the groupby operation, which you can examine with grouped.groups. In case of a multicolumn groupby these subgroups refer to several columns, but this is irrelevant as len counts by the rows in pandas objects.
Oct 18, 2013 at 8:45
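To make that concrete, here is a small sketch (again assuming the sample frame from the question) that prints what `grouped.groups` holds and what each `x` looks like. The names `sub` and `sizes` are only illustrative.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"A": ["foo", "bar", "foo", "foo"], "B": [0, 1, 2, 3]})
grouped = df.groupby("A")

# Mapping of group key -> row labels belonging to that group
print(grouped.groups)

# Iterating yields (key, sub-DataFrame) pairs; each sub-DataFrame is
# the kind of object that filter passes to the lambda as x
sizes = {}
for key, sub in grouped:
    sizes[key] = len(sub)  # len counts rows, regardless of column count
    print(key, type(sub).__name__, len(sub))
```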
-
Is there a way to get a GroupBy object after filter, not a DataFrame? The only way I see now is to call groupby again, but this seems inefficient
Oct 27, 2015 at 15:56
-
@IvanVirabyan Worse, with categorical values the empty groups pop up again.
Mar 27, 2018 at 18:32
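A sketch of the categorical case described above, and one possible way around it. This assumes a pandas version whose `groupby` accepts the `observed` argument (0.23 or later, as far as I know); the variable names are illustrative.

```python
import pandas as pd

# Sample frame from the question, with the grouping column made categorical
df = pd.DataFrame({"A": ["foo", "bar", "foo", "foo"], "B": [0, 1, 2, 3]})
df["A"] = df["A"].astype("category")

# Drop the single-row group "bar"
filtered = df.groupby("A", observed=True).filter(lambda x: len(x) > 1)

# The category "bar" is still part of the dtype, so grouping again with
# observed=False brings it back as an empty group
with_empty = filtered.groupby("A", observed=False).size()

# observed=True only reports categories that actually occur in the data
only_present = filtered.groupby("A", observed=True).size()

print(with_empty)
print(only_present)
```

Calling `filtered["A"].cat.remove_unused_categories()` before grouping again should have a similar effect.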
-
grouped.filter(lambda x: len(x.index) > 1) should be slightly faster
May 24, 2019 at 10:37
You can use the method filter and the property shape:
df.groupby('A').filter(lambda x: x.shape[0] > 1)
answered Feb 13, 2021 at 18:07
Mykola Zotko
If you still need a workaround:
In [49]: pd.concat([group for _, group in grouped if len(group) > 1])
Out[49]:
     A  B
0  foo  0
2  foo  2
3  foo  3
answered Nov 1, 2012 at 17:00
Chang She
-
Thank you, that's what I had implemented for now, but it would be nice to know how to do filtering on grouped objects, because that would avoid writing a new list comprehension for each custom filtering case.
Nov 1, 2012 at 17:57
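One way to avoid repeating the list comprehension is to wrap the workaround in a small function. The helper below, `filter_groups`, is hypothetical and not part of pandas; it is only a sketch of the idea, assuming the sample frame from the question.

```python
import pandas as pd


def filter_groups(grouped, predicate):
    """Hypothetical helper (not a pandas API): keep the groups for which
    predicate(group) is true and stitch them back into one DataFrame."""
    kept = [group for _, group in grouped if predicate(group)]
    # Note: pd.concat raises ValueError when no group passes the predicate.
    # sort_index restores the original row order for a sortable index.
    return pd.concat(kept).sort_index()


# Sample frame from the question
df = pd.DataFrame({"A": ["foo", "bar", "foo", "foo"], "B": [0, 1, 2, 3]})
grouped = df.groupby("A")

by_size = filter_groups(grouped, lambda g: len(g) > 1)
by_sum = filter_groups(grouped, lambda g: g["B"].sum() < 2)
print(by_size)
print(by_sum)
```

On pandas 0.12 and later, `grouped.filter` covers the same ground without a helper.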
-
The issue #919 cited above would be a good solution once someone implements it
Nov 9, 2012 at 20:59
Source: https://stackoverflow.com/questions/13167391/filtering-grouped-dataframe-in-pandas
Posted by: branchligival.blogspot.com
Give us a concrete example, and show what you have tried.
Oct 31, 2012 at 21:08
Hopefully some help: grouped.size().apply(lambda x: x>1), but I'm not sure how to do this
Oct 31, 2012 at 21:44
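The idea in this comment can be carried through to a result. Note that `size` is a method on the groupby object and has to be called, which is part of why `grouped[grouped.size > 1]` fails. A minimal sketch, assuming the sample frame from the question (the names `sizes` and `keep` are illustrative):

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"A": ["foo", "bar", "foo", "foo"], "B": [0, 1, 2, 3]})

# size() returns a Series indexed by group key with the row count per group
sizes = df.groupby("A").size()

# Keys of the groups that have more than one row
keep = sizes[sizes > 1].index

# Select the rows whose key is among the kept groups
result = df[df["A"].isin(keep)]
print(result)
```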
This is interesting. For a change, I have hit an area where a feature I need is missing in Pandas. For a long time it was my understanding of it that was missing. Great library for what I do.
Oct 31, 2012 at 21:51