Self Joins

Join a table to itself to find relationships within the same table.

16 min read
Beginner

Introduction to Self Joins

In previous lessons, you learned to join different tables together (orders with customers, products with categories). But what if you need to find relationships within the same table?

For example:

  • Employees table: Find who reports to whom (employee-manager relationships)
  • Products table: Find products in the same category
  • Users table: Find user referral chains
  • Routes table: Connect flight segments to build complete journeys

This is where self joins come in - joining a table to itself!

A join where a table is joined with itself to find relationships between rows in the same table. You use different aliases to treat the same table as two separate tables in the query.

Why use self joins?

  • Find hierarchical relationships (employees and their managers)
  • Compare rows within the same table
  • Find patterns and connections in your data
  • Build recursive relationships

The key concept: Use table aliases to reference the same table twice with different names.

Employee-Manager Relationships

The most common example is an employee hierarchy. Let's work with an employees table where each employee has a manager (who is also an employee):

employees
id
name
position
manager_id
salary
1Sarah ChenCEONULL250000
2John DavisVP Engineering1180000
3Alice SmithEngineer295000
4Bob JohnsonEngineer290000
5Carol WhiteVP Sales1170000
6David BrownSales Rep565000

Notice:

  • manager_id references id in the same table
  • Sarah (CEO) has no manager (NULL)
  • John and Carol report to Sarah
  • Alice and Bob report to John
  • David reports to Carol

How do we show each employee with their manager's name? We need to join employees to itself!

Basic Self Join Syntax

Syntax:

SELECT columns
FROM table AS alias1
INNER JOIN table AS alias2
  ON alias1.column = alias2.column;

Key points:

  1. Use the same table twice with different aliases
  2. alias1 represents one "copy" of the table (employees)
  3. alias2 represents another "copy" of the table (managers)
  4. Join them on the relationship column (employee.manager_id = manager.id)

Let's see it in action:

sql
-- Show each employee with their manager's name
SELECT
  e.name AS employee,
  e.position AS employee_position,
  m.name AS manager,
  m.position AS manager_position
FROM employees AS e
INNER JOIN employees AS m
  ON e.manager_id = m.id;

What happened:

  1. e alias = employees (all employees)
  2. m alias = managers (the same table, but we're treating it as managers)
  3. For each employee, we look up their manager by matching e.manager_id = m.id
  4. We select columns from both "copies": e.name (employee) and m.name (manager)

Why is Sarah missing? She has manager_id = NULL (no manager), so there's no match in the INNER JOIN. We'll fix this later with LEFT JOIN!

Understanding the Aliases

The aliases let you treat one table as two separate entities:

Think of it like this:

employees AS e          employees AS m
(all employees)         (all managers)
-----------------       -----------------
1  Sarah  NULL          1  Sarah  NULL
2  John   1      →      2  John   1
3  Alice  2      →      3  Alice  2
4  Bob    2      →      4  Bob    2
5  Carol  1      →      5  Carol  1
6  David  5      →      6  David  5

Join condition: e.manager_id = m.id
- John (e) has manager_id=1 → matches Sarah (m) with id=1
- Alice (e) has manager_id=2 → matches John (m) with id=2
- Bob (e) has manager_id=2 → matches John (m) with id=2

Without aliases, you couldn't distinguish between "employee name" and "manager name" - they're both from the same name column!

Including Top-Level Employees with LEFT JOIN

To include employees with no manager (like Sarah), use a LEFT JOIN instead:

sql
-- Include employees even if they have no manager
SELECT
  e.name AS employee,
  e.position,
  COALESCE(m.name, 'No Manager') AS manager
FROM employees AS e
LEFT JOIN employees AS m
  ON e.manager_id = m.id
ORDER BY e.id;

Now Sarah appears! LEFT JOIN includes all rows from the left table (e) even when there's no match in the right table (m).

We used COALESCE(m.name, 'No Manager') to replace NULL with a readable message.

Finding Employees at the Same Level

Another use case: find employees who share the same manager (peers/colleagues):

sql
-- Find pairs of employees with the same manager
SELECT
  e1.name AS employee1,
  e2.name AS employee2,
  m.name AS shared_manager
FROM employees AS e1
INNER JOIN employees AS e2
  ON e1.manager_id = e2.manager_id
  AND e1.id < e2.id  -- Avoid duplicates and self-pairs
INNER JOIN employees AS m
  ON e1.manager_id = m.id;

What's happening:

  1. Join employees to itself (e1 and e2)
  2. Match rows where both have the same manager_id
  3. e1.id < e2.id prevents:
    • Self-pairing (Alice with Alice)
    • Duplicates (Alice-Bob and Bob-Alice)
  4. Join to m to get the manager's name

Result: Pairs of employees who are peers!

  • John and Carol both report to Sarah
  • Alice and Bob both report to John

Comparing Salaries with Self Join

Self joins are great for comparisons within a table:

sql
-- Find employees who earn more than their manager
SELECT
  e.name AS employee,
  e.salary AS employee_salary,
  m.name AS manager,
  m.salary AS manager_salary,
  e.salary - m.salary AS salary_difference
FROM employees AS e
INNER JOIN employees AS m
  ON e.manager_id = m.id
WHERE e.salary > m.salary;

Good news - no employees earn more than their managers in this dataset!

Example: Find employees earning within $10k of each other

sql
SELECT
  e1.name AS employee1,
  e1.salary AS salary1,
  e2.name AS employee2,
  e2.salary AS salary2,
  ABS(e1.salary - e2.salary) AS salary_diff
FROM employees AS e1
INNER JOIN employees AS e2
  ON e1.id < e2.id
  AND ABS(e1.salary - e2.salary) <= 10000
ORDER BY salary_diff;

Alice and Bob have salaries within $5,000 of each other!

Multi-Level Hierarchies

You can join a table to itself multiple times to traverse multiple levels:

sql
-- Show employee, their manager, and their manager's manager
SELECT
  e.name AS employee,
  e.position AS emp_position,
  m1.name AS direct_manager,
  m2.name AS senior_manager
FROM employees AS e
LEFT JOIN employees AS m1
  ON e.manager_id = m1.id
LEFT JOIN employees AS m2
  ON m1.manager_id = m2.id
WHERE e.position = 'Engineer';

Chain of command:

  • Alice/Bob → report to John → who reports to Sarah

This works by joining employees three times:

  1. e = employees
  2. m1 = their direct managers
  3. m2 = their managers' managers

Practical Example: Flight Connections

Let's look at a different scenario - connecting flights:

flights
flight_id
origin
destination
departure_time
price
F101NYCCHI08:00150
F102CHILAX12:00200
F103NYCDEN09:00180
F104DENLAX14:00190
F105LAXSEA16:00120
sql
-- Find two-flight connections from NYC to LAX
SELECT
  f1.flight_id AS first_flight,
  f1.origin AS start_city,
  f1.destination AS connection_city,
  f1.price AS first_price,
  f2.flight_id AS second_flight,
  f2.destination AS final_city,
  f2.price AS second_price,
  f1.price + f2.price AS total_price
FROM flights AS f1
INNER JOIN flights AS f2
  ON f1.destination = f2.origin
WHERE f1.origin = 'NYC'
  AND f2.destination = 'LAX';

Two routes found:

  1. NYC → CHI → LAX ($350 total)
  2. NYC → DEN → LAX ($370 total)

The self join matches where the first flight's destination equals the second flight's origin!

Common Mistakes to Avoid

Don't make these mistakes:

  1. Forgetting aliases: ❌

    • SELECT name FROM employees JOIN employees...
    • Which table's name? You must use: e.name and m.name
  2. Using the same alias twice: ❌

    • FROM employees e JOIN employees e...
    • Must use different aliases: employees e and employees m
  3. Wrong join condition: ❌

    • ON e.id = m.id (joins each row to itself!)
    • Correct: ON e.manager_id = m.id
  4. Getting duplicates in comparisons: ❌

    • Without e1.id < e2.id, you get both Alice-Bob and Bob-Alice
    • And Alice-Alice, Bob-Bob (self-pairs)
  5. Forgetting about NULLs: ❌

    • INNER JOIN excludes rows with NULL in the join column
    • Use LEFT JOIN to include top-level records
  6. Cartesian explosion: ⚠️

    • Without proper conditions, self joins can create massive result sets
    • Always have specific WHERE/ON conditions
  7. Confusing column names: ❌

    • Use clear aliases: employee, manager, employee1, employee2
    • Don't just select e.name, m.name without renaming

More Practical Examples

Example: Product recommendations (similar products)

products
id
product_name
category
price
1Laptop ProElectronics1200
2Laptop AirElectronics999
3Desktop PCElectronics800
4Office ChairFurniture350
sql
-- Find products in the same category (for recommendations)
SELECT
  p1.product_name AS product,
  p2.product_name AS similar_product,
  p1.category,
  ABS(p1.price - p2.price) AS price_difference
FROM products AS p1
INNER JOIN products AS p2
  ON p1.category = p2.category
  AND p1.id < p2.id
WHERE p1.category = 'Electronics'
ORDER BY price_difference;

Use case: "Customers who viewed Laptop Air also liked: Laptop Pro ($201 more)"

Example: Find employees hired in the same year

employees_extended
id
name
hire_date
1Alice2022-03-15
2Bob2022-07-20
3Charlie2023-01-10
sql
SELECT
  e1.name AS employee1,
  e2.name AS employee2,
  YEAR(e1.hire_date) AS hire_year
FROM employees_extended AS e1
INNER JOIN employees_extended AS e2
  ON YEAR(e1.hire_date) = YEAR(e2.hire_date)
  AND e1.id < e2.id;

Alice and Bob were hired in the same year (2022) and might have gone through onboarding together!

Key Takeaways

What you learned:

Self join = joining a table to itself to find relationships within the same table ✅ Use different aliases (e and m, e1 and e2) to reference the table twice ✅ Common use case: hierarchies (employees → managers) ✅ Also useful for: comparisons, finding pairs, connectionsINNER JOIN excludes NULLs; use LEFT JOIN to include top-level records ✅ Use alias1.id < alias2.id to avoid duplicates and self-pairs ✅ Can join a table to itself multiple times for multi-level traversal ✅ Always use clear column aliases (AS employee, AS manager) for readability

Real-world applications:

  • Organization charts (employee hierarchies)
  • Social networks (friend-of-friend relationships)
  • Transportation (connecting routes)
  • E-commerce (product recommendations)
  • Family trees (parent-child relationships)

Practice Exercise: Try these self joins:

  1. Using the employees table, find all employees who earn more than Alice
  2. Find pairs of products with prices within $100 of each other
  3. Show each employee with their manager's salary
  4. Find all possible 2-stop flight routes from NYC to SEA
  5. List employees hired within 30 days of each other
  6. Create a 3-level hierarchy: employee → manager → senior manager → executive