Written by: Marlon Colca
Posted on 14 May 2025 - 5 months ago
python pandas analytics
Learn how to group, aggregate, and pivot your data using Pandas. Master `groupby`, `agg`, and `pivot_table` with real examples.
In this post, we’ll explore how to group and summarize your data with Pandas using methods like:
groupby()
agg()
pivot_table()
These tools help you answer real questions like:
We’ll use the same dataset (prices_sample.csv
):
import pandas as pd
df = pd.read_csv("prices_sample.csv")
groupby()
+ aggregationLet’s start simple: what’s the average price per category?
avg_price = df.groupby("category")["price"].mean()
print(avg_price)
You can also get multiple aggregations:
summary = df.groupby("category")["price"].agg(["mean", "min", "max", "count"])
print(summary)
agg()
You can even pass custom functions or lambdas:
df.groupby("brand")["price"].agg(
average="mean",
highest="max",
products="count"
)
You can rename the output columns with named aggregation
like above (average=...
).
This is a very common task:
product_counts = df["category"].value_counts()
print(product_counts)
Or using groupby:
df.groupby("category")["product_id"].count()
pivot_table()
Pivot tables give you more flexibility for reshaping data. Let’s say we want the average price per category and availability:
pivot = pd.pivot_table(df, values="price", index="category", columns="in_stock", aggfunc="mean")
print(pivot)
This shows how price changes depending on whether items are in stock or not.
.unstack()
and .melt()
You can convert groupby results into wide format with .unstack()
:
group = df.groupby(["category", "in_stock"])["price"].mean()
print(group.unstack())
Or go from wide → long format using .melt()
:
melted = df.melt(id_vars=["category"], value_vars=["price"])
print(melted.head())
Task | Use |
---|---|
Simple summaries | groupby() |
Complex reshaping | pivot_table() |
Multi-level aggregations | agg() |
Format manipulation | melt() , unstack() |
Grouping and aggregating are essential tools in your data toolkit.
Remember:
groupby()
for grouping rowsagg()
for multiple summariespivot_table()
for reshaping with aggregationsvalue_counts()
for quick category countsIn Part 6, we’ll work with dates and time in Pandas: parsing, grouping by month/week, and time series tricks ⏳.
Perfect if you want to build charts over time or analyze trends.
See you there!
Learn how to use groupby and aggregation in Pandas to analyze product sales data efficiently.
17 May 2025 - 5 months ago