Common Table Expressions (CTEs)

Write cleaner queries using WITH and CTEs.

16 min read
Beginner

Common Table Expressions (CTEs)

Common Table Expressions (CTEs) are one of the most powerful features in modern SQL for writing clean, readable, and maintainable queries. They allow you to define temporary named result sets that exist only during the execution of a query.

Think of CTEs as temporary "scratch pads" where you can store intermediate results and reference them multiple times in your main query. They're especially useful for breaking down complex queries into logical, easy-to-understand steps.

Why Use CTEs?

  • Readability: Break complex queries into named, logical steps
  • Reusability: Reference the same result set multiple times in one query
  • Recursion: Build hierarchical queries (organizational charts, file systems, etc.)
  • Maintainability: Easier to debug and modify than deeply nested subqueries
  • Performance: In some cases, better than correlated subqueries

CTEs have become a standard in modern SQL development and are supported by all major databases (MySQL 8.0+, PostgreSQL, SQL Server, Oracle, SQLite 3.8+).

CTEs were introduced in MySQL 8.0. If you're using MySQL 5.7 or earlier, you'll need to use subqueries or temporary tables instead.

Basic CTE Syntax

The basic syntax for a CTE uses the WITH keyword followed by the CTE name, column names (optional), and the query definition:

A temporary named result set that you can reference within a SELECT, INSERT, UPDATE, or DELETE statement. Defined using the WITH clause.

Here's a simple example using an employees database:

sql
-- Basic CTE: Calculate average salary by department
WITH dept_avg_salary AS (
  SELECT
    department,
    AVG(salary) AS avg_salary
  FROM employees
  GROUP BY department
)
SELECT
  e.employee_id,
  e.name,
  e.department,
  e.salary,
  d.avg_salary,
  e.salary - d.avg_salary AS difference_from_avg
FROM employees e
JOIN dept_avg_salary d ON e.department = d.department
ORDER BY e.department, e.name;

In this example, the CTE dept_avg_salary calculates the average salary for each department, then we join it with the employees table to show how each employee's salary compares to their department average.

Key Points:

  • The CTE is defined once with WITH cte_name AS (...)
  • The CTE query runs first, creating a temporary result set
  • You can then reference the CTE name just like a table
  • The CTE only exists for the duration of the query
Give your CTEs descriptive names that explain what data they contain. Names like `high_performers`, `monthly_totals`, or `active_customers` are much better than `cte1` or `temp`.

Multiple CTEs

One of the most powerful features of CTEs is the ability to chain multiple CTEs together. Each CTE can reference CTEs defined before it, allowing you to build complex logic step by step.

sql
-- Multiple CTEs: Analyze employee performance and compensation
WITH high_performers AS (
  SELECT
    employee_id,
    name,
    department,
    salary,
    performance_score
  FROM employees
  WHERE performance_score >= 4.0
),
dept_stats AS (
  SELECT
    department,
    COUNT(*) AS total_employees,
    COUNT(CASE WHEN performance_score >= 4.0 THEN 1 END) AS high_performers_count,
    AVG(salary) AS avg_salary
  FROM employees
  GROUP BY department
),
performance_analysis AS (
  SELECT
    d.department,
    d.total_employees,
    d.high_performers_count,
    ROUND(100.0 * d.high_performers_count / d.total_employees, 1) AS high_performer_percentage,
    d.avg_salary
  FROM dept_stats d
)
SELECT
  h.name,
  h.department,
  h.salary,
  h.performance_score,
  p.avg_salary AS dept_avg_salary,
  ROUND(100.0 * (h.salary - p.avg_salary) / p.avg_salary, 1) AS salary_vs_avg_pct,
  p.high_performer_percentage AS dept_high_performer_pct
FROM high_performers h
JOIN performance_analysis p ON h.department = p.department
ORDER BY h.performance_score DESC, h.salary DESC;

This example demonstrates three chained CTEs:

  1. high_performers: Filters employees with high performance scores
  2. dept_stats: Calculates statistics for each department
  3. performance_analysis: Computes performance percentages

Each CTE builds on the previous work, making the final query much easier to understand than a deeply nested subquery would be.

When using multiple CTEs, separate them with commas, NOT additional WITH keywords. The syntax is: `WITH cte1 AS (...), cte2 AS (...), cte3 AS (...)`

CTEs with JOINs

CTEs can reference multiple tables and perform complex joins, making them perfect for simplifying multi-table queries.

sql
-- CTE with joins: Find customers who haven't purchased in 90 days
WITH recent_purchases AS (
  SELECT
    o.customer_id,
    c.customer_name,
    c.email,
    MAX(o.order_date) AS last_order_date,
    COUNT(o.order_id) AS total_orders,
    SUM(oi.quantity * oi.price) AS lifetime_value
  FROM orders o
  JOIN customers c ON o.customer_id = c.customer_id
  JOIN order_items oi ON o.order_id = oi.order_id
  GROUP BY o.customer_id, c.customer_name, c.email
),
inactive_customers AS (
  SELECT
    customer_id,
    customer_name,
    email,
    last_order_date,
    DATEDIFF(CURRENT_DATE, last_order_date) AS days_since_order,
    total_orders,
    lifetime_value
  FROM recent_purchases
  WHERE DATEDIFF(CURRENT_DATE, last_order_date) > 90
)
SELECT
  customer_name,
  email,
  last_order_date,
  days_since_order,
  total_orders,
  ROUND(lifetime_value, 2) AS lifetime_value
FROM inactive_customers
WHERE lifetime_value > 500
ORDER BY lifetime_value DESC, days_since_order DESC;

This query identifies valuable customers who haven't ordered recently - perfect for a re-engagement campaign. The CTEs break down the logic:

  1. recent_purchases: Joins orders, customers, and order items to calculate metrics
  2. inactive_customers: Filters to customers inactive for 90+ days
  3. Final SELECT: Shows only high-value customers worth re-engaging

Reusing CTEs Multiple Times

Unlike subqueries, you can reference a CTE multiple times in the same query without re-executing it. This can improve both readability and performance.

sql
-- Reuse CTE: Compare current month vs previous month sales
WITH monthly_sales AS (
  SELECT
    DATE_FORMAT(order_date, '%Y-%m') AS month,
    SUM(total_amount) AS total_sales,
    COUNT(DISTINCT customer_id) AS unique_customers,
    COUNT(order_id) AS total_orders
  FROM orders
  WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 2 MONTH)
  GROUP BY DATE_FORMAT(order_date, '%Y-%m')
)
SELECT
  current.month AS current_month,
  current.total_sales AS current_sales,
  current.unique_customers AS current_customers,
  previous.total_sales AS previous_sales,
  previous.unique_customers AS previous_customers,
  ROUND(current.total_sales - previous.total_sales, 2) AS sales_change,
  ROUND(100.0 * (current.total_sales - previous.total_sales) / previous.total_sales, 1) AS sales_change_pct,
  current.unique_customers - previous.unique_customers AS customer_change
FROM monthly_sales current
JOIN monthly_sales previous
  ON DATE_FORMAT(DATE_SUB(STR_TO_DATE(CONCAT(current.month, '-01'), '%Y-%m-%d'), INTERVAL 1 MONTH), '%Y-%m') = previous.month
ORDER BY current.month DESC
LIMIT 1;

Notice how we reference monthly_sales twice - once as current and once as previous - without having to write the aggregation logic twice. This is a huge advantage over subqueries.

Recursive CTEs

Recursive CTEs are a special type that can reference themselves, making them perfect for hierarchical or tree-structured data like organizational charts, file systems, or category hierarchies.

A recursive CTE has two parts:

  1. Base case (anchor member): The starting point that doesn't reference the CTE
  2. Recursive case (recursive member): References the CTE to build on previous results

These are combined with UNION ALL.

A CTE that references itself to process hierarchical data. It consists of a base case (starting point) and a recursive case (iterative step), combined with UNION ALL.
sql
-- Recursive CTE: Show organizational hierarchy with levels
WITH RECURSIVE org_hierarchy AS (
  -- Base case: Start with CEO (no manager)
  SELECT
    employee_id,
    name,
    title,
    manager_id,
    1 AS level,
    CAST(name AS CHAR(500)) AS hierarchy_path
  FROM employees
  WHERE manager_id IS NULL

  UNION ALL

  -- Recursive case: Find employees reporting to previous level
  SELECT
    e.employee_id,
    e.name,
    e.title,
    e.manager_id,
    oh.level + 1,
    CONCAT(oh.hierarchy_path, ' > ', e.name)
  FROM employees e
  JOIN org_hierarchy oh ON e.manager_id = oh.employee_id
)
SELECT
  level,
  employee_id,
  name,
  title,
  hierarchy_path
FROM org_hierarchy
ORDER BY level, name;

This recursive CTE walks down the organizational hierarchy:

  1. Base case: Finds the CEO (employee with no manager)
  2. Recursive case: Finds employees who report to anyone already in the result set
  3. Result: Complete organizational tree with levels and paths

How it works:

  • Start with level 1 (CEO)
  • Find all employees reporting to level 1 → level 2
  • Find all employees reporting to level 2 → level 3
  • Continue until no more employees are found
Recursive CTEs can create infinite loops if not properly constrained. Always ensure your recursive case has a termination condition (like a NULL manager_id or a maximum depth). Some databases have a maximum recursion depth (default is often 100).

Here's another practical example - calculating running totals:

sql
-- Recursive CTE: Generate date series and calculate running totals
WITH RECURSIVE date_series AS (
  -- Base case: Start date
  SELECT
    DATE('2026-01-01') AS date,
    1 AS day_number

  UNION ALL

  -- Recursive case: Add one day at a time
  SELECT
    DATE_ADD(date, INTERVAL 1 DAY),
    day_number + 1
  FROM date_series
  WHERE day_number < 7
)
SELECT
  d.date,
  d.day_number,
  COALESCE(o.daily_revenue, 0) AS daily_revenue,
  SUM(COALESCE(o.daily_revenue, 0)) OVER (ORDER BY d.date) AS running_total
FROM date_series d
LEFT JOIN (
  SELECT
    DATE(order_date) AS order_date,
    SUM(total_amount) AS daily_revenue
  FROM orders
  WHERE order_date >= '2026-01-01' AND order_date < '2026-01-08'
  GROUP BY DATE(order_date)
) o ON d.date = o.order_date
ORDER BY d.date;

This example uses a recursive CTE to generate a complete date series, ensuring we have rows even for days with no sales. This is useful for reports that need to show gaps in data.

CTEs vs Subqueries vs Views

When should you use a CTE instead of a subquery or view? Here's a comparison:

Feature
CTE
Subquery
View
Reusability in queryCan reference multiple timesEach subquery re-executesCan reference multiple times
ReadabilityNamed, at top of queryNested, harder to readNamed, defined separately
PersistenceQuery duration onlyQuery duration onlyPermanent database object
RecursionSupportedNot supportedNot directly supported
ScopeSingle querySingle queryEntire database
PerformanceGenerally goodMay re-executeCached plan

Use CTEs when:

  • You need to reference the same result set multiple times in one query
  • You're breaking down a complex query for better readability
  • You need recursion for hierarchical data
  • You want temporary organization without creating database objects

Use subqueries when:

  • You only need the result once in a simple context
  • The subquery is very simple (e.g., single value in WHERE clause)
  • You prefer everything inline

Use views when:

  • Multiple queries across your application need the same logic
  • You want to encapsulate security/permissions
  • You need the result set to persist beyond one query
  • You want to abstract complexity for other developers
Start with CTEs when building complex queries. They make debugging much easier because you can test each CTE independently. You can always convert to a view later if the logic needs to be reused across multiple queries.

Real-World Example: Sales Analysis Pipeline

Let's put it all together with a comprehensive sales analysis that uses multiple CTEs to build a complete picture:

sql
-- Complete sales analysis with multiple CTEs
WITH customer_segments AS (
  -- Segment customers by purchase frequency
  SELECT
    customer_id,
    COUNT(order_id) AS total_orders,
    SUM(total_amount) AS lifetime_value,
    CASE
      WHEN COUNT(order_id) >= 10 THEN 'VIP'
      WHEN COUNT(order_id) >= 5 THEN 'Regular'
      ELSE 'Occasional'
    END AS segment
  FROM orders
  WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)
  GROUP BY customer_id
),
product_performance AS (
  -- Calculate product metrics
  SELECT
    p.product_id,
    p.product_name,
    p.category,
    COUNT(DISTINCT oi.order_id) AS times_ordered,
    SUM(oi.quantity) AS units_sold,
    SUM(oi.quantity * oi.price) AS total_revenue
  FROM products p
  JOIN order_items oi ON p.product_id = oi.product_id
  JOIN orders o ON oi.order_id = o.order_id
  WHERE o.order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)
  GROUP BY p.product_id, p.product_name, p.category
),
segment_preferences AS (
  -- Find what each segment buys
  SELECT
    cs.segment,
    pp.category,
    COUNT(DISTINCT o.order_id) AS orders_count,
    SUM(oi.quantity * oi.price) AS revenue,
    ROUND(AVG(oi.price), 2) AS avg_price
  FROM customer_segments cs
  JOIN orders o ON cs.customer_id = o.customer_id
  JOIN order_items oi ON o.order_id = oi.order_id
  JOIN product_performance pp ON oi.product_id = pp.product_id
  GROUP BY cs.segment, pp.category
),
segment_summary AS (
  -- Summarize by segment
  SELECT
    segment,
    SUM(revenue) AS total_segment_revenue
  FROM segment_preferences
  GROUP BY segment
)
SELECT
  sp.segment,
  sp.category,
  sp.orders_count,
  ROUND(sp.revenue, 2) AS category_revenue,
  ROUND(100.0 * sp.revenue / ss.total_segment_revenue, 1) AS pct_of_segment,
  sp.avg_price
FROM segment_preferences sp
JOIN segment_summary ss ON sp.segment = ss.segment
WHERE sp.revenue > 1000  -- Only show significant categories
ORDER BY sp.segment, sp.revenue DESC;

This analysis pipeline shows the power of CTEs:

  1. customer_segments: Classifies customers by purchase behavior
  2. product_performance: Calculates metrics for each product
  3. segment_preferences: Combines customer and product data
  4. segment_summary: Calculates totals for percentage calculations
  5. Final query: Presents insights about what each segment prefers

Without CTEs, this would be a nightmare of nested subqueries. With CTEs, each step is clear and testable.

Common Mistakes

1. Using WITH multiple times for multiple CTEs

Wrong:

WITH cte1 AS (SELECT ...)
WITH cte2 AS (SELECT ...)  -- Error!
SELECT * FROM cte1, cte2;

Correct:

WITH cte1 AS (SELECT ...),
     cte2 AS (SELECT ...)
SELECT * FROM cte1, cte2;

2. Forgetting RECURSIVE keyword for recursive CTEs

Wrong:

WITH org_chart AS (
  SELECT ... WHERE manager_id IS NULL
  UNION ALL
  SELECT ... JOIN org_chart ...  -- Error!
)

Correct:

WITH RECURSIVE org_chart AS (
  SELECT ... WHERE manager_id IS NULL
  UNION ALL
  SELECT ... JOIN org_chart ...
)

3. Creating infinite recursion

Wrong:

WITH RECURSIVE numbers AS (
  SELECT 1 AS n
  UNION ALL
  SELECT n + 1 FROM numbers  -- Never stops!
)
SELECT * FROM numbers;

Correct:

WITH RECURSIVE numbers AS (
  SELECT 1 AS n
  UNION ALL
  SELECT n + 1 FROM numbers WHERE n < 100  -- Termination condition
)
SELECT * FROM numbers;

4. Using UNION instead of UNION ALL in recursive CTEs

Recursive CTEs require UNION ALL, not UNION. Using UNION will cause errors or unexpected behavior.

5. Trying to reference a CTE from another query

CTEs only exist for the duration of a single query. You can't define a CTE and then use it in a separate query.

Key Takeaways

  • CTEs use the WITH clause to define temporary named result sets that exist for one query
  • Multiple CTEs are separated by commas and can reference previous CTEs
  • CTEs improve readability by breaking complex queries into logical, named steps
  • CTEs can be reused multiple times in the same query without re-execution
  • Recursive CTEs use WITH RECURSIVE and include a base case + recursive case with UNION ALL
  • Recursive CTEs are perfect for hierarchies like org charts, file systems, or category trees
  • Always include termination conditions in recursive CTEs to prevent infinite loops
  • CTEs are better than subqueries for complex queries that need clarity
  • CTEs are temporary and don't persist like views do
  • Modern SQL development uses CTEs extensively for maintainable code
When learning CTEs, start by converting complex nested subqueries into CTEs. You'll immediately see the readability improvement and understand when CTEs add value.

Practice Exercise

Using the following tables:

employees (employee_id, name, department, salary, hire_date, manager_id)
departments (department_id, department_name, budget)
projects (project_id, project_name, department_id, start_date, end_date)

Write a query using CTEs that:

  1. Calculates total salary cost per department
  2. Calculates budget utilization percentage (salary cost / budget)
  3. Lists all departments over 80% budget utilization
  4. For those departments, shows which employees were hired in the last year
  5. Includes the department's project count

Try to solve this using at least 3 CTEs, focusing on clarity and logical organization.

Bonus challenge: Use a recursive CTE to show the management hierarchy for employees in over-budget departments.