-->
Written by: Marlon Colca
Posted on 11 May 2025 - 4 months ago
python pandas analytics
Learn how to explore and understand your dataset using Pandas. From `.head()` to `.describe()` and `.value_counts()`, this post walks you through the essential tools.
Before you clean, transform, or model your data, you need to understand what you’re working with.
That’s where Exploratory Data Analysis (EDA) comes in — and Pandas makes it easy.
In this post, we’ll look at the basic tools you can use to inspect your dataset and start asking the right questions.
For this tutorial, let’s imagine you’ve loaded a CSV with product prices:
import pandas as pd
df = pd.read_csv("prices_sample.csv")
Let’s now explore it step by step 👇
.head(), .tail(), .shape, .info()
These are your first tools when working with any dataset.
# First 5 rows
print(df.head())
# Last 5 rows
print(df.tail())
# Number of rows and columns
print(df.shape)
# Column types and nulls (info() prints its own report and returns None)
df.info()
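If you don’t have prices_sample.csv at hand, here is a minimal sketch of the same first look on a small made-up DataFrame. The column names and values are assumptions for illustration, not the contents of the real file. It also shows that info() writes its report itself and returns None, so calling it on its own line is enough.

```python
import io

import pandas as pd

# Hypothetical stand-in for prices_sample.csv; columns and values are made up
df = pd.DataFrame({
    "product": ["Milk", "Bread", "Eggs", "Rice", "Soap", "Tea"],
    "category": ["dairy", "bakery", "dairy", "pantry", "home", "pantry"],
    "price": [1.20, 2.50, 3.10, None, 0.99, 4.75],
})

# head() and tail() accept a row count if 5 is too many or too few
print(df.head(3))
print(df.tail(2))

# shape is an attribute (a tuple), not a method, so no parentheses
n_rows, n_cols = df.shape
print(n_rows, n_cols)

# info() prints directly and returns None; pass buf to capture the text instead
buffer = io.StringIO()
result = df.info(buf=buffer)
info_text = buffer.getvalue()
print(info_text)
```

Capturing the report in a buffer is handy when you want to log it or save it next to your notebook.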
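Use these to quickly understand what the rows look like, how many rows and columns you have, and each column’s type and non-null count.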
.describe()
This one is a must. It gives you quick stats on all numerical columns:
print(df.describe())
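You’ll get the count, mean, standard deviation, minimum, maximum, and the 25th, 50th, and 75th percentiles for each numeric column.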
💡 Great for spotting outliers or weird values (e.g. negative prices?).
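As a rough sketch of that outlier check, the example below uses a made-up price column (the data is an assumption) and reads the minimum straight out of the describe() result. It also shows include="all", which adds non-numeric columns to the summary.

```python
import pandas as pd

# Hypothetical data with one obviously wrong price
df = pd.DataFrame({
    "product": ["Milk", "Bread", "Eggs", "Tea", "Laptop"],
    "price": [1.20, 2.50, -3.10, 4.75, 250.00],
})

# By default only numeric columns are summarized
stats = df.describe()
print(stats)

# describe() returns a DataFrame, so individual stats can be looked up by label
min_price = stats.loc["min", "price"]

# A negative minimum is a red flag: pull out the offending rows
suspicious = df[df["price"] < 0]
print(suspicious)

# include="all" also summarizes text columns (count, unique, top, freq)
all_stats = df.describe(include="all")
print(all_stats)
```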
.value_counts()
For categorical columns, this method shows how often each value appears.
# Count of products per category
print(df["category"].value_counts())
You can also use it on booleans or binary flags, like availability or on_sale columns.
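Two options are worth knowing here: normalize=True returns proportions instead of counts, and dropna=False keeps missing values in the tally. A small sketch, assuming made-up category and on_sale columns:

```python
import pandas as pd

# Hypothetical columns; one category is missing on purpose
df = pd.DataFrame({
    "category": ["dairy", "bakery", "dairy", "pantry", None, "dairy"],
    "on_sale": [True, False, False, True, False, False],
})

# Missing values are left out of the counts by default
counts = df["category"].value_counts()
print(counts)

# dropna=False adds a row for the missing values
counts_with_missing = df["category"].value_counts(dropna=False)
print(counts_with_missing)

# normalize=True gives each value's share of the non-missing rows
shares = df["category"].value_counts(normalize=True)
print(shares)

# Works the same way on a boolean flag
sale_counts = df["on_sale"].value_counts()
print(sale_counts)
```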
.isnull().sum()
Knowing where your missing data lives is essential.
# Total missing values per column
print(df.isnull().sum())
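Raw counts are easier to judge next to the share of rows they represent. The sketch below, on made-up data, puts both side by side and lists the rows that have at least one gap:

```python
import pandas as pd

# Hypothetical data with gaps in two columns
df = pd.DataFrame({
    "product": ["Milk", "Bread", "Eggs", "Rice"],
    "price": [1.20, None, 3.10, None],
    "brand": ["Acme", "Acme", None, "Bolt"],
})

missing_counts = df.isnull().sum()

# The mean of a boolean mask is the fraction of True values
missing_share = df.isnull().mean()

# Combine both into one table, worst columns first
summary = pd.DataFrame({
    "missing": missing_counts,
    "share": missing_share,
}).sort_values("missing", ascending=False)
print(summary)

# Rows with at least one missing value
rows_with_gaps = df[df.isnull().any(axis=1)]
print(rows_with_gaps)
```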
This helps you decide whether to fill the gaps (fillna()) or drop them (dropna()).
Want to see how many different values a column has?
print(df["brand"].nunique())
print(df["brand"].unique())
Useful to spot typos, inconsistencies, or too many categories.
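One way to check for that kind of inconsistency is to compare the number of unique values before and after a light normalization. This is a sketch on a made-up brand column; stripping whitespace and lowercasing is just one possible normalization, not a general fix.

```python
import pandas as pd

# Hypothetical brand column with casing and whitespace variants
df = pd.DataFrame({"brand": ["Acme", "acme", "Acme ", "Bolt", "Bolt", None]})

# nunique() ignores missing values unless dropna=False is passed
raw_count = df["brand"].nunique()
raw_count_with_missing = df["brand"].nunique(dropna=False)

# Normalize whitespace and casing, then count again
cleaned = df["brand"].str.strip().str.lower()
clean_count = cleaned.nunique()

print(raw_count, clean_count)
print(cleaned.dropna().unique())
```

If the count drops after normalizing, some of the "different" values were really the same brand written in different ways.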
Here’s a quick EDA checklist you can use every time you load new data:
✅ df.head() and df.tail()
✅ df.shape and df.info()
✅ df.describe()
✅ df.isnull().sum()
✅ value_counts() on key categorical columns
✅ unique() and nunique() for quick validation
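If you run this checklist often, you could wrap it in a small function. The quick_eda helper below is hypothetical (it is not part of Pandas), and the sample data is made up. It collects the results in a dict instead of printing them, and uses dtypes in place of info() because info() only prints.

```python
import pandas as pd


def quick_eda(df, categorical=()):
    """Hypothetical helper: run the checklist and return the results as a dict."""
    return {
        "head": df.head(),
        "tail": df.tail(),
        "shape": df.shape,
        "dtypes": df.dtypes,
        "describe": df.describe(),
        "missing": df.isnull().sum(),
        # One value_counts() table per categorical column the caller names
        "value_counts": {col: df[col].value_counts() for col in categorical},
        "nunique": df.nunique(),
    }


# Made-up sample data for illustration
df = pd.DataFrame({
    "product": ["Milk", "Bread", "Eggs", "Rice"],
    "category": ["dairy", "bakery", "dairy", "pantry"],
    "price": [1.20, 2.50, None, 0.99],
})

report = quick_eda(df, categorical=["category"])
print(report["shape"])
print(report["missing"])
print(report["value_counts"]["category"])
```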
Now that we understand the shape of our data, it’s time to clean it up.
In the next entry, we’ll dive into fixing missing values, renaming columns, fixing data types, and more.
See you in Part 3! 🧼