Kekkei - Advancing Education For Everyone

Introduction to UNION Operations

Sometimes you need to combine results from multiple queries into a single result set. For example:

Merge customer lists from multiple regions
Combine current and archived records
Create a unified view of data from different tables
Consolidate similar data for reporting

UNION operations let you stack query results vertically - combining rows from multiple SELECT statements into one result set!

A SQL operator that combines the results of two or more SELECT statements into a single result set. It "stacks" rows vertically, removing duplicate rows by default. UNION ALL includes all rows including duplicates.

Key concepts:

Combines results vertically (adding rows)
Different from JOINs which combine horizontally (adding columns)
All SELECT statements must have same number of columns
Column data types must be compatible
Column names come from the first SELECT

Think of UNION as "stacking" query results on top of each other.

Basic UNION Syntax

Syntax:

SELECT column1, column2
FROM table1
UNION
SELECT column1, column2
FROM table2;

Let's work with two employee tables from different offices:

us_employees

id	name	office	salary
1	Alice Smith	New York	95000
2	Bob Johnson	New York	85000
3	Carol White	Chicago	90000

uk_employees

id	name	office	salary
101	David Brown	London	80000
102	Emma Wilson	Manchester	75000
103	Frank Miller	London	95000

sql

-- Combine all employees from both offices
SELECT id, name, office, salary
FROM us_employees
UNION
SELECT id, name, office, salary
FROM uk_employees;

What happened:

First SELECT gets rows from us_employees
Second SELECT gets rows from uk_employees
UNION combines them into one result set
All 6 rows are returned (no duplicates in this case)

UNION vs UNION ALL

UNION removes duplicate rows, UNION ALL keeps all rows including duplicates.

Let's see the difference with overlapping data:

current_customers

customer_id	name	status
1	Acme Corp	active
2	TechStart Inc	active
3	Global Ltd	active

vip_customers

customer_id	name	status
2	TechStart Inc	vip
3	Global Ltd	vip
4	Elite Co	vip

sql

-- UNION (removes duplicates)
SELECT customer_id, name
FROM current_customers
UNION
SELECT customer_id, name
FROM vip_customers
ORDER BY customer_id;

Notice customers 2 and 3 appear only once - UNION removed duplicates!

Now with UNION ALL:

sql

-- UNION ALL (keeps duplicates)
SELECT customer_id, name
FROM current_customers
UNION ALL
SELECT customer_id, name
FROM vip_customers
ORDER BY customer_id;

With UNION ALL, customers 2 and 3 appear twice - all rows kept!

When to use each:

UNION: When you want unique rows only (slower, removes duplicates)
UNION ALL: When you want all rows including duplicates (faster, no deduplication)

Performance tip: Use UNION ALL when you know there are no duplicates or duplicates are acceptable. It's faster because it skips the deduplication step.

UNION Requirements - Same Structure

All SELECT statements in a UNION must have:

Same number of columns
Compatible data types (same position)
Columns don't need same names (first SELECT determines names)

sql

-- This works - same structure
SELECT id, name, 'US' AS region
FROM us_employees
UNION
SELECT id, name, 'UK' AS region
FROM uk_employees;

This fails - different number of columns:

SELECT id, name FROM us_employees
UNION
SELECT id, name, office FROM uk_employees;  -- Error! 2 vs 3 columns

This fails - incompatible types:

SELECT id, name FROM us_employees
UNION
SELECT name, id FROM uk_employees;  -- Error! INT vs VARCHAR mismatch

Adding Identifying Columns

Often you want to know which query each row came from:

sql

-- Add source identifier
SELECT
    name,
    office,
    salary,
    'US Office' AS source
FROM us_employees
UNION ALL
SELECT
    name,
    office,
    salary,
    'UK Office' AS source
FROM uk_employees
ORDER BY salary DESC;

The source column helps identify where each row originated!

Combining More Than Two Queries

You can UNION multiple SELECT statements:

q1_sales

product	sales
Laptop	50000
Mouse	5000

q2_sales

product	sales
Laptop	55000
Keyboard	8000

q3_sales

product	sales
Laptop	60000
Monitor	30000

sql

-- Combine sales from all quarters
SELECT product, sales, 'Q1' AS quarter FROM q1_sales
UNION ALL
SELECT product, sales, 'Q2' AS quarter FROM q2_sales
UNION ALL
SELECT product, sales, 'Q3' AS quarter FROM q3_sales
ORDER BY product, quarter;

UNION with WHERE Clauses

Each SELECT can have its own WHERE clause:

sql

-- High earners from both offices
SELECT name, office, salary, 'US' AS region
FROM us_employees
WHERE salary >= 90000
UNION
SELECT name, office, salary, 'UK' AS region
FROM uk_employees
WHERE salary >= 90000
ORDER BY salary DESC;

UNION with Aggregations

Combine aggregated results from different tables:

sql

-- Summary statistics from both offices
SELECT
    'US Office' AS office,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary,
    MAX(salary) AS max_salary
FROM us_employees
UNION ALL
SELECT
    'UK Office' AS office,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary,
    MAX(salary) AS max_salary
FROM uk_employees;

UNION with JOINs

Each SELECT in a UNION can include JOINs:

orders_2023

order_id	customer_id	total
1	1	500
2	2	300

orders_2024

order_id	customer_id	total
101	1	600
102	3	450

customers

customer_id	name
1	Alice Corp
2	Bob Inc
3	Carol Ltd

sql

-- All orders with customer names from both years
SELECT
    o.order_id,
    c.name AS customer,
    o.total,
    2023 AS year
FROM orders_2023 o
JOIN customers c ON o.customer_id = c.customer_id
UNION ALL
SELECT
    o.order_id,
    c.name AS customer,
    o.total,
    2024 AS year
FROM orders_2024 o
JOIN customers c ON o.customer_id = c.customer_id
ORDER BY year, order_id;

Practical Example: Customer Contact List

Combine multiple sources into a unified contact list:

newsletter_subscribers

email	name
alice@example.com	Alice Smith
bob@example.com	Bob Johnson

customers

email	first_name	last_name
bob@example.com	Bob	Johnson
carol@example.com	Carol	White

event_attendees

attendee_email	attendee_name
carol@example.com	Carol White
david@example.com	David Brown

sql

-- Unified contact list from all sources
SELECT
    email,
    name,
    'Newsletter' AS source
FROM newsletter_subscribers
UNION
SELECT
    email,
    CONCAT(first_name, ' ', last_name) AS name,
    'Customer' AS source
FROM customers
UNION
SELECT
    attendee_email AS email,
    attendee_name AS name,
    'Event' AS source
FROM event_attendees
ORDER BY email;

Notice duplicates were removed! Bob and Carol appear in multiple sources but show only once.

To see all occurrences:

sql

-- Show all occurrences with UNION ALL
SELECT email, name, 'Newsletter' AS source
FROM newsletter_subscribers
UNION ALL
SELECT email, CONCAT(first_name, ' ', last_name), 'Customer'
FROM customers
UNION ALL
SELECT attendee_email, attendee_name, 'Event'
FROM event_attendees
ORDER BY email, source;

UNION with NULL Values

Columns that don't exist in all queries can use NULL:

sql

-- Combine products with different structures
SELECT
    product_name,
    price,
    NULL AS discount_percent
FROM regular_products
UNION ALL
SELECT
    product_name,
    price,
    discount
FROM clearance_products
WHERE discount > 0;

This ensures all queries have the same number of columns!

Common Mistakes to Avoid

Don't make these mistakes:

Different number of columns: ❌
- SELECT a, b UNION SELECT a, b, c
- All queries must have same column count
Incompatible data types: ❌
- SELECT id, name UNION SELECT name, id
- Column types must match by position
Using UNION when UNION ALL is better: ⚠️
- UNION removes duplicates (slower)
- If no duplicates exist or duplicates are OK, use UNION ALL
Forgetting ORDER BY goes at the end: ❌
- SELECT * FROM t1 ORDER BY id UNION SELECT * FROM t2
- Correct: SELECT * FROM t1 UNION SELECT * FROM t2 ORDER BY id
Not aliasing columns from first SELECT: ❌
- Column names come from first SELECT only
- Alias columns there if you want specific names
Assuming column names must match: ❌
- Names don't need to match, only count and types
- Names from first SELECT are used
Using UNION on large tables without filters: ⚠️
- Filter with WHERE before combining
- Reduces data volume and improves performance

Performance Considerations

UNION performance tips:

Use UNION ALL when possible
- Skip duplicate checking if not needed
- Can be significantly faster
Filter before combining
- Add WHERE clauses to each SELECT
- Reduces rows to process
Index important columns
- Especially columns in WHERE clauses
- Helps each individual query run faster
Consider alternatives for very large unions
- If combining many tables, consider views
- Or restructure schema if constantly using UNION

Key Takeaways

What you learned:

✅ UNION combines results from multiple SELECT statements vertically ✅ UNION removes duplicate rows (default) ✅ UNION ALL keeps all rows including duplicates (faster) ✅ All SELECT statements must have:

Same number of columns
Compatible data types (by position) ✅ Column names come from first SELECT ✅ ORDER BY goes at the very end (after all UNIONs) ✅ Each SELECT can have its own WHERE, JOINs, GROUP BY ✅ Can add literal columns to identify source ✅ Use NULL to pad missing columns

Real-world applications:

Combining data from multiple regions/offices
Merging current and archived tables
Creating unified contact lists
Consolidating historical data by time period
Building comprehensive reports from multiple sources

Practice Exercise:

Combine customers from two regions, add a region column
Create a unified product list from multiple categories
Combine sales data from Q1, Q2, Q3, Q4 with quarter labels
Merge employee lists showing only those earning > 80000
Create a master email list from customers, newsletter, and event attendees
Compare UNION vs UNION ALL performance on duplicate data
Combine monthly summaries (COUNT, SUM) from 12 monthly tables

Introduction to UNION OperationsFocusStart Focus Mode

Basic UNION SyntaxFocusStart Focus Mode

UNION vs UNION ALLFocusStart Focus Mode

UNION Requirements - Same StructureFocusStart Focus Mode

Adding Identifying ColumnsFocusStart Focus Mode

Combining More Than Two QueriesFocusStart Focus Mode

UNION with WHERE ClausesFocusStart Focus Mode

UNION with AggregationsFocusStart Focus Mode

UNION with JOINsFocusStart Focus Mode

Practical Example: Customer Contact ListFocusStart Focus Mode

UNION with NULL ValuesFocusStart Focus Mode

Common Mistakes to AvoidFocusStart Focus Mode

Performance ConsiderationsFocusStart Focus Mode

Key TakeawaysFocusStart Focus Mode