This post shows how to use pandas groupby() together with the count() aggregation, with detailed examples.
What Is Pandas Groupby?
A groupby operation divides the data into groups and performs a calculation on each group. In most cases it follows three steps: splitting the object, applying a function to each piece, and combining the results.
The basic syntax of groupby(), which the examples in this post use to get the count for each group, is as follows:
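The three steps can be written out by hand to see what groupby() does. The following is only a sketch: the small DataFrame and the names pieces, counts, manual, and grouped are made up for illustration and are not part of pandas.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration.
df = pd.DataFrame({
    "Teachers": ["Simmon", "Hanson", "Simmon", "Putin"],
    "Fee": [32000, 33000, 35000, 44000],
})

# Split: one sub-DataFrame per distinct teacher.
pieces = {name: df[df["Teachers"] == name] for name in df["Teachers"].unique()}

# Apply: count the Fee values inside each piece.
counts = {name: piece["Fee"].count() for name, piece in pieces.items()}

# Combine: gather the per-group results into a single Series.
manual = pd.Series(counts).sort_index()

# groupby() performs the same three steps in one call.
grouped = df.groupby("Teachers")["Fee"].count()

print(manual.tolist() == grouped.tolist())
```

In practice the groupby() call is what you would write; the manual version is there only to make the split, apply, and combine steps visible.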
DataFrame.groupby(by, axis, as_index)
Where:
- by (label, list of labels, function, mapping such as a dict or Series, or array): determines the groups. A function is called on each value of the index, while a list or array with the same length as the axis is used exactly as is to establish the groups.
- axis (int, default 0): 0 splits along rows, whereas 1 splits along columns. This parameter is deprecated in recent pandas releases, so it is best left at its default.
- as_index (bool, default True): for aggregated output, returns an object with the group labels as the index.
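The effect of as_index is easiest to see side by side. This is a minimal sketch on a made-up three-row DataFrame; the variable names indexed and flat are chosen for illustration only.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration.
df = pd.DataFrame({
    "Teachers": ["Simmon", "Hanson", "Simmon"],
    "Fee": [32000, 33000, 35000],
})

# Default (as_index=True): the group labels become the index of the result.
indexed = df.groupby("Teachers").count()

# as_index=False: the group labels stay as an ordinary column.
flat = df.groupby("Teachers", as_index=False).count()

print(indexed)
print(flat)
```

The flat form is often more convenient when the result is going to be merged with other data or written to a file.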
Simple DataFrame groupby() And count() Examples
If you need to group a DataFrame by one or more columns and get the count for each group, the snippets below show the pattern.
# Using groupby() and count()
df2 = df.groupby(['Teachers'])['Teachers'].count()
# Using groupby() and count() on multiple columns
df2 = df.groupby(['Teachers','Prolongation'])['Fee'].count()
To run these examples, let's first construct a DataFrame with a few rows and columns, and then check the results. The columns in our DataFrame are named Teachers, Fee, Prolongation, and Discount.
# Create a pandas DataFrame.
import pandas as pd
englishteaching = ({
'Teachers':["Simmon","Pinmark","Hanson","Putin","Plex","Hanson","Simmon","Putin"],
'Fee' :[32000,45000,33000,44000,36000,45000,35000,42000],
'Prolongation':['50days','30days','40days','35days','35days','60days','50days','55days'],
'Discount':[2000,1300,2000,1100,2500,2300,1200,1500]
})
df = pd.DataFrame(englishteaching, columns=['Teachers','Fee','Prolongation','Discount'])
print(df)
Running this produces the output below:
  Teachers    Fee Prolongation  Discount
0   Simmon  32000       50days      2000
1  Pinmark  45000       30days      1300
2   Hanson  33000       40days      2000
3    Putin  44000       35days      1100
4     Plex  36000       35days      2500
5   Hanson  45000       60days      2300
6   Simmon  35000       50days      1200
7    Putin  42000       55days      1500
Employing count() by Column Name
Use DataFrame.groupby() to group the rows by a column, then call count() to get the number of values in each group. count() only counts non-missing values, so None and NaN entries are ignored.
It also works with non-numeric data such as strings. The example below groups the rows by the Teachers column and counts how many times each teacher appears.
# Using groupby() and count()
df2 = df.groupby(['Teachers'])['Teachers'].count()
print(df2)
The output returned will be:
Teachers
Hanson     2
Pinmark    1
Plex       1
Putin      2
Simmon     2
Name: Teachers, dtype: int64
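Because count() skips missing values, its result can differ from the number of rows in a group. The sketch below uses a made-up DataFrame with one missing fee and compares count() with size(), which counts every row; the names non_missing and all_rows are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing Fee value.
df = pd.DataFrame({
    "Teachers": ["Simmon", "Hanson", "Simmon", "Hanson"],
    "Fee": [32000, np.nan, 35000, 45000],
})

grouped = df.groupby("Teachers")["Fee"]

# count() ignores the NaN fee in the Hanson group.
non_missing = grouped.count()

# size() counts every row in the group, missing or not.
all_rows = grouped.size()

print(non_missing)
print(all_rows)
```

If the goal is the number of rows per group regardless of missing data, size() is usually the safer choice.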
Pandas groupby() And count() On List Of Columns
To group by several columns and get a count for each combination of values, pass a list of column names to the groupby() method.
For instance, df.groupby(['Teachers','Prolongation'])['Fee'].count()
groups by the 'Teachers' and 'Prolongation' columns and then counts the 'Fee' values in each group.
# Using groupby() and count() on multiple columns
df2 = df.groupby(['Teachers','Prolongation'])['Fee'].count()
print(df2)
The output returned will be:
Teachers  Prolongation
Hanson    40days          1
          60days          1
Pinmark   30days          1
Plex      35days          1
Putin     35days          1
          55days          1
Simmon    50days          2
Name: Fee, dtype: int64
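The result of a multi-column groupby is a Series with a two-level index. If a flat table is easier to work with, one option is reset_index(), sketched here on a smaller made-up DataFrame; the column name "count" and the variable names are chosen for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration.
df = pd.DataFrame({
    "Teachers": ["Simmon", "Hanson", "Simmon", "Hanson"],
    "Prolongation": ["50days", "40days", "50days", "60days"],
    "Fee": [32000, 33000, 35000, 45000],
})

# Series indexed by (Teachers, Prolongation).
counts = df.groupby(["Teachers", "Prolongation"])["Fee"].count()

# Move both index levels back into columns and name the count column.
flat = counts.reset_index(name="count")

print(flat)
```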
Conclusion
That covers the fundamentals of using pandas groupby() with count(). Hopefully this article has been helpful. See you next time!