Common Table Expressions (CTEs)
Write cleaner queries using WITH and CTEs.
Common Table Expressions (CTEs)
Common Table Expressions (CTEs) are one of the most powerful features in modern SQL for writing clean, readable, and maintainable queries. They allow you to define temporary named result sets that exist only during the execution of a query.
Think of CTEs as temporary "scratch pads" where you can store intermediate results and reference them multiple times in your main query. They're especially useful for breaking down complex queries into logical, easy-to-understand steps.
Why Use CTEs?
- Readability: Break complex queries into named, logical steps
- Reusability: Reference the same result set multiple times in one query
- Recursion: Build hierarchical queries (organizational charts, file systems, etc.)
- Maintainability: Easier to debug and modify than deeply nested subqueries
- Performance: In some cases, better than correlated subqueries
CTEs have become a standard in modern SQL development and are supported by all major databases (MySQL 8.0+, PostgreSQL, SQL Server, Oracle, SQLite 3.8+).
Basic CTE Syntax
The basic syntax for a CTE uses the WITH keyword followed by the CTE name, column names (optional), and the query definition:
Here's a simple example using an employees database:
-- Basic CTE: Calculate average salary by department
WITH dept_avg_salary AS (
SELECT
department,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department
)
SELECT
e.employee_id,
e.name,
e.department,
e.salary,
d.avg_salary,
e.salary - d.avg_salary AS difference_from_avg
FROM employees e
JOIN dept_avg_salary d ON e.department = d.department
ORDER BY e.department, e.name;In this example, the CTE dept_avg_salary calculates the average salary for each department, then we join it with the employees table to show how each employee's salary compares to their department average.
Key Points:
- The CTE is defined once with
WITH cte_name AS (...) - The CTE query runs first, creating a temporary result set
- You can then reference the CTE name just like a table
- The CTE only exists for the duration of the query
Multiple CTEs
One of the most powerful features of CTEs is the ability to chain multiple CTEs together. Each CTE can reference CTEs defined before it, allowing you to build complex logic step by step.
-- Multiple CTEs: Analyze employee performance and compensation
WITH high_performers AS (
SELECT
employee_id,
name,
department,
salary,
performance_score
FROM employees
WHERE performance_score >= 4.0
),
dept_stats AS (
SELECT
department,
COUNT(*) AS total_employees,
COUNT(CASE WHEN performance_score >= 4.0 THEN 1 END) AS high_performers_count,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department
),
performance_analysis AS (
SELECT
d.department,
d.total_employees,
d.high_performers_count,
ROUND(100.0 * d.high_performers_count / d.total_employees, 1) AS high_performer_percentage,
d.avg_salary
FROM dept_stats d
)
SELECT
h.name,
h.department,
h.salary,
h.performance_score,
p.avg_salary AS dept_avg_salary,
ROUND(100.0 * (h.salary - p.avg_salary) / p.avg_salary, 1) AS salary_vs_avg_pct,
p.high_performer_percentage AS dept_high_performer_pct
FROM high_performers h
JOIN performance_analysis p ON h.department = p.department
ORDER BY h.performance_score DESC, h.salary DESC;This example demonstrates three chained CTEs:
- high_performers: Filters employees with high performance scores
- dept_stats: Calculates statistics for each department
- performance_analysis: Computes performance percentages
Each CTE builds on the previous work, making the final query much easier to understand than a deeply nested subquery would be.
CTEs with JOINs
CTEs can reference multiple tables and perform complex joins, making them perfect for simplifying multi-table queries.
-- CTE with joins: Find customers who haven't purchased in 90 days
WITH recent_purchases AS (
SELECT
o.customer_id,
c.customer_name,
c.email,
MAX(o.order_date) AS last_order_date,
COUNT(o.order_id) AS total_orders,
SUM(oi.quantity * oi.price) AS lifetime_value
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
GROUP BY o.customer_id, c.customer_name, c.email
),
inactive_customers AS (
SELECT
customer_id,
customer_name,
email,
last_order_date,
DATEDIFF(CURRENT_DATE, last_order_date) AS days_since_order,
total_orders,
lifetime_value
FROM recent_purchases
WHERE DATEDIFF(CURRENT_DATE, last_order_date) > 90
)
SELECT
customer_name,
email,
last_order_date,
days_since_order,
total_orders,
ROUND(lifetime_value, 2) AS lifetime_value
FROM inactive_customers
WHERE lifetime_value > 500
ORDER BY lifetime_value DESC, days_since_order DESC;This query identifies valuable customers who haven't ordered recently - perfect for a re-engagement campaign. The CTEs break down the logic:
- recent_purchases: Joins orders, customers, and order items to calculate metrics
- inactive_customers: Filters to customers inactive for 90+ days
- Final SELECT: Shows only high-value customers worth re-engaging
Reusing CTEs Multiple Times
Unlike subqueries, you can reference a CTE multiple times in the same query without re-executing it. This can improve both readability and performance.
-- Reuse CTE: Compare current month vs previous month sales
WITH monthly_sales AS (
SELECT
DATE_FORMAT(order_date, '%Y-%m') AS month,
SUM(total_amount) AS total_sales,
COUNT(DISTINCT customer_id) AS unique_customers,
COUNT(order_id) AS total_orders
FROM orders
WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 2 MONTH)
GROUP BY DATE_FORMAT(order_date, '%Y-%m')
)
SELECT
current.month AS current_month,
current.total_sales AS current_sales,
current.unique_customers AS current_customers,
previous.total_sales AS previous_sales,
previous.unique_customers AS previous_customers,
ROUND(current.total_sales - previous.total_sales, 2) AS sales_change,
ROUND(100.0 * (current.total_sales - previous.total_sales) / previous.total_sales, 1) AS sales_change_pct,
current.unique_customers - previous.unique_customers AS customer_change
FROM monthly_sales current
JOIN monthly_sales previous
ON DATE_FORMAT(DATE_SUB(STR_TO_DATE(CONCAT(current.month, '-01'), '%Y-%m-%d'), INTERVAL 1 MONTH), '%Y-%m') = previous.month
ORDER BY current.month DESC
LIMIT 1;Notice how we reference monthly_sales twice - once as current and once as previous - without having to write the aggregation logic twice. This is a huge advantage over subqueries.
Recursive CTEs
Recursive CTEs are a special type that can reference themselves, making them perfect for hierarchical or tree-structured data like organizational charts, file systems, or category hierarchies.
A recursive CTE has two parts:
- Base case (anchor member): The starting point that doesn't reference the CTE
- Recursive case (recursive member): References the CTE to build on previous results
These are combined with UNION ALL.
-- Recursive CTE: Show organizational hierarchy with levels
WITH RECURSIVE org_hierarchy AS (
-- Base case: Start with CEO (no manager)
SELECT
employee_id,
name,
title,
manager_id,
1 AS level,
CAST(name AS CHAR(500)) AS hierarchy_path
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: Find employees reporting to previous level
SELECT
e.employee_id,
e.name,
e.title,
e.manager_id,
oh.level + 1,
CONCAT(oh.hierarchy_path, ' > ', e.name)
FROM employees e
JOIN org_hierarchy oh ON e.manager_id = oh.employee_id
)
SELECT
level,
employee_id,
name,
title,
hierarchy_path
FROM org_hierarchy
ORDER BY level, name;This recursive CTE walks down the organizational hierarchy:
- Base case: Finds the CEO (employee with no manager)
- Recursive case: Finds employees who report to anyone already in the result set
- Result: Complete organizational tree with levels and paths
How it works:
- Start with level 1 (CEO)
- Find all employees reporting to level 1 → level 2
- Find all employees reporting to level 2 → level 3
- Continue until no more employees are found
Here's another practical example - calculating running totals:
-- Recursive CTE: Generate date series and calculate running totals
WITH RECURSIVE date_series AS (
-- Base case: Start date
SELECT
DATE('2026-01-01') AS date,
1 AS day_number
UNION ALL
-- Recursive case: Add one day at a time
SELECT
DATE_ADD(date, INTERVAL 1 DAY),
day_number + 1
FROM date_series
WHERE day_number < 7
)
SELECT
d.date,
d.day_number,
COALESCE(o.daily_revenue, 0) AS daily_revenue,
SUM(COALESCE(o.daily_revenue, 0)) OVER (ORDER BY d.date) AS running_total
FROM date_series d
LEFT JOIN (
SELECT
DATE(order_date) AS order_date,
SUM(total_amount) AS daily_revenue
FROM orders
WHERE order_date >= '2026-01-01' AND order_date < '2026-01-08'
GROUP BY DATE(order_date)
) o ON d.date = o.order_date
ORDER BY d.date;This example uses a recursive CTE to generate a complete date series, ensuring we have rows even for days with no sales. This is useful for reports that need to show gaps in data.
CTEs vs Subqueries vs Views
When should you use a CTE instead of a subquery or view? Here's a comparison:
Feature | CTE | Subquery | View |
|---|---|---|---|
| Reusability in query | Can reference multiple times | Each subquery re-executes | Can reference multiple times |
| Readability | Named, at top of query | Nested, harder to read | Named, defined separately |
| Persistence | Query duration only | Query duration only | Permanent database object |
| Recursion | Supported | Not supported | Not directly supported |
| Scope | Single query | Single query | Entire database |
| Performance | Generally good | May re-execute | Cached plan |
Use CTEs when:
- You need to reference the same result set multiple times in one query
- You're breaking down a complex query for better readability
- You need recursion for hierarchical data
- You want temporary organization without creating database objects
Use subqueries when:
- You only need the result once in a simple context
- The subquery is very simple (e.g., single value in WHERE clause)
- You prefer everything inline
Use views when:
- Multiple queries across your application need the same logic
- You want to encapsulate security/permissions
- You need the result set to persist beyond one query
- You want to abstract complexity for other developers
Real-World Example: Sales Analysis Pipeline
Let's put it all together with a comprehensive sales analysis that uses multiple CTEs to build a complete picture:
-- Complete sales analysis with multiple CTEs
WITH customer_segments AS (
-- Segment customers by purchase frequency
SELECT
customer_id,
COUNT(order_id) AS total_orders,
SUM(total_amount) AS lifetime_value,
CASE
WHEN COUNT(order_id) >= 10 THEN 'VIP'
WHEN COUNT(order_id) >= 5 THEN 'Regular'
ELSE 'Occasional'
END AS segment
FROM orders
WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)
GROUP BY customer_id
),
product_performance AS (
-- Calculate product metrics
SELECT
p.product_id,
p.product_name,
p.category,
COUNT(DISTINCT oi.order_id) AS times_ordered,
SUM(oi.quantity) AS units_sold,
SUM(oi.quantity * oi.price) AS total_revenue
FROM products p
JOIN order_items oi ON p.product_id = oi.product_id
JOIN orders o ON oi.order_id = o.order_id
WHERE o.order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)
GROUP BY p.product_id, p.product_name, p.category
),
segment_preferences AS (
-- Find what each segment buys
SELECT
cs.segment,
pp.category,
COUNT(DISTINCT o.order_id) AS orders_count,
SUM(oi.quantity * oi.price) AS revenue,
ROUND(AVG(oi.price), 2) AS avg_price
FROM customer_segments cs
JOIN orders o ON cs.customer_id = o.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN product_performance pp ON oi.product_id = pp.product_id
GROUP BY cs.segment, pp.category
),
segment_summary AS (
-- Summarize by segment
SELECT
segment,
SUM(revenue) AS total_segment_revenue
FROM segment_preferences
GROUP BY segment
)
SELECT
sp.segment,
sp.category,
sp.orders_count,
ROUND(sp.revenue, 2) AS category_revenue,
ROUND(100.0 * sp.revenue / ss.total_segment_revenue, 1) AS pct_of_segment,
sp.avg_price
FROM segment_preferences sp
JOIN segment_summary ss ON sp.segment = ss.segment
WHERE sp.revenue > 1000 -- Only show significant categories
ORDER BY sp.segment, sp.revenue DESC;This analysis pipeline shows the power of CTEs:
- customer_segments: Classifies customers by purchase behavior
- product_performance: Calculates metrics for each product
- segment_preferences: Combines customer and product data
- segment_summary: Calculates totals for percentage calculations
- Final query: Presents insights about what each segment prefers
Without CTEs, this would be a nightmare of nested subqueries. With CTEs, each step is clear and testable.
Common Mistakes
1. Using WITH multiple times for multiple CTEs
❌ Wrong:
WITH cte1 AS (SELECT ...)
WITH cte2 AS (SELECT ...) -- Error!
SELECT * FROM cte1, cte2;
✅ Correct:
WITH cte1 AS (SELECT ...),
cte2 AS (SELECT ...)
SELECT * FROM cte1, cte2;
2. Forgetting RECURSIVE keyword for recursive CTEs
❌ Wrong:
WITH org_chart AS (
SELECT ... WHERE manager_id IS NULL
UNION ALL
SELECT ... JOIN org_chart ... -- Error!
)
✅ Correct:
WITH RECURSIVE org_chart AS (
SELECT ... WHERE manager_id IS NULL
UNION ALL
SELECT ... JOIN org_chart ...
)
3. Creating infinite recursion
❌ Wrong:
WITH RECURSIVE numbers AS (
SELECT 1 AS n
UNION ALL
SELECT n + 1 FROM numbers -- Never stops!
)
SELECT * FROM numbers;
✅ Correct:
WITH RECURSIVE numbers AS (
SELECT 1 AS n
UNION ALL
SELECT n + 1 FROM numbers WHERE n < 100 -- Termination condition
)
SELECT * FROM numbers;
4. Using UNION instead of UNION ALL in recursive CTEs
Recursive CTEs require UNION ALL, not UNION. Using UNION will cause errors or unexpected behavior.
5. Trying to reference a CTE from another query
CTEs only exist for the duration of a single query. You can't define a CTE and then use it in a separate query.
Key Takeaways
- CTEs use the WITH clause to define temporary named result sets that exist for one query
- Multiple CTEs are separated by commas and can reference previous CTEs
- CTEs improve readability by breaking complex queries into logical, named steps
- CTEs can be reused multiple times in the same query without re-execution
- Recursive CTEs use
WITH RECURSIVEand include a base case + recursive case withUNION ALL - Recursive CTEs are perfect for hierarchies like org charts, file systems, or category trees
- Always include termination conditions in recursive CTEs to prevent infinite loops
- CTEs are better than subqueries for complex queries that need clarity
- CTEs are temporary and don't persist like views do
- Modern SQL development uses CTEs extensively for maintainable code
Practice Exercise
Using the following tables:
employees (employee_id, name, department, salary, hire_date, manager_id)
departments (department_id, department_name, budget)
projects (project_id, project_name, department_id, start_date, end_date)
Write a query using CTEs that:
- Calculates total salary cost per department
- Calculates budget utilization percentage (salary cost / budget)
- Lists all departments over 80% budget utilization
- For those departments, shows which employees were hired in the last year
- Includes the department's project count
Try to solve this using at least 3 CTEs, focusing on clarity and logical organization.
Bonus challenge: Use a recursive CTE to show the management hierarchy for employees in over-budget departments.