Indexes & Performance

Speed up queries with indexes - learn when and how to use them effectively.

18 min read
Intermediate

Introduction to Indexes

Imagine searching for a specific book in a library with 1 million books - but they're randomly arranged on shelves. You'd have to check every book! Now imagine the same library with a card catalog organized alphabetically. Finding your book is instant.

Indexes are database card catalogs. They dramatically speed up data retrieval by creating organized lookup structures. Without indexes, databases scan every row (slow). With indexes, they jump directly to the data (fast).

A data structure that improves the speed of data retrieval operations on a database table. Indexes create a sorted copy of selected columns with pointers to the actual rows, allowing fast lookups without scanning the entire table.

Why use indexes?

  • Speed: Find rows in milliseconds instead of seconds
  • Performance: Queries run 10x-1000x faster with proper indexes
  • Scalability: Handle millions of rows efficiently
  • User experience: Faster page loads, better responsiveness

Trade-offs:

  • Storage: Indexes take up disk space
  • Write speed: INSERT/UPDATE/DELETE slightly slower (must update indexes)
  • Maintenance: Indexes need occasional rebuilding

Think of it this way: indexes make reads faster but writes slightly slower.

How Indexes Work

Without an index, the database performs a table scan - checking every row:

sql
-- Without index: scans all 1,000,000 rows
SELECT * FROM customers
WHERE email = 'alice@example.com';

With an index on email, the database jumps directly to the row:

sql
-- With index: uses index lookup
SELECT * FROM customers
WHERE email = 'alice@example.com';

How it works:

  1. Index maintains a sorted copy of the email column
  2. Uses binary search to find 'alice@example.com' quickly
  3. Retrieves pointer to actual row location
  4. Fetches row data in one operation

Analogy: Like a book's index at the back - instead of reading every page to find "SQL", you look in the index and jump to page 247.

Creating Indexes

Basic syntax:

CREATE INDEX index_name
ON table_name (column_name);

Let's create indexes on a customers table:

sql
-- Create single-column index
CREATE INDEX idx_customers_email
ON customers (email);
sql
-- Create index on last name
CREATE INDEX idx_customers_last_name
ON customers (last_name);

Naming convention:

  • idx_tablename_columnname
  • Example: idx_customers_email, idx_orders_date
  • Makes purpose clear and easy to manage

Types of Indexes

1. Single-column index - Most common, indexes one column:

sql
CREATE INDEX idx_products_price
ON products (price);

2. Composite index (multi-column) - Indexes multiple columns together:

sql
-- Index on last_name AND first_name together
CREATE INDEX idx_customers_name
ON customers (last_name, first_name);

Composite index benefits:

  • Speeds up queries filtering on both columns
  • ORDER matters! (last_name, first_name) ≠ (first_name, last_name)
  • Can use left portions: last_name alone uses index, but first_name alone doesn't

3. UNIQUE index - Ensures no duplicates + speeds up searches:

sql
-- Unique index on email (no duplicates allowed)
CREATE UNIQUE INDEX idx_users_email
ON users (email);

PRIMARY KEY and UNIQUE constraints automatically create unique indexes!

4. Full-text index - For text search in large text columns:

sql
-- Full-text search on article content
CREATE FULLTEXT INDEX idx_articles_content
ON articles (title, content);

When to Use Indexes

Create indexes on columns that are:

Frequently in WHERE clauses:

sql
-- Frequent query - needs index
SELECT * FROM orders
WHERE order_date = '2024-01-15';

-- Create index
CREATE INDEX idx_orders_date ON orders (order_date);

Used in JOIN conditions:

sql
-- Foreign keys should be indexed
CREATE INDEX idx_orders_customer_id
ON orders (customer_id);

SELECT * FROM orders o
JOIN customers c ON o.customer_id = c.customer_id;

Used in ORDER BY:

sql
-- Often sorted by date
CREATE INDEX idx_orders_date ON orders (order_date);

SELECT * FROM orders
ORDER BY order_date DESC;

Columns with high cardinality (many unique values):

  • Email addresses - GOOD (unique for each user)
  • Country - POOR (only ~200 unique values)
  • Gender - VERY POOR (only 2-3 values)

When NOT to Use Indexes

Don't create indexes on:

Small tables (< 1000 rows)

  • Table scan is faster than index overhead

Columns with low cardinality

  • Boolean fields (TRUE/FALSE)
  • Gender (M/F)
  • Status with few values ('active', 'inactive')

Columns rarely queried

  • Don't index unused columns

Tables with frequent writes

  • Every INSERT/UPDATE/DELETE must update indexes
  • Slows down writes significantly

Very long text columns

  • Full-text indexes are better for text search

Rule of thumb: If a column appears in WHERE, JOIN, or ORDER BY frequently AND the table has 1000+ rows, it probably needs an index.

Viewing Existing Indexes

See what indexes exist on a table:

sql
-- MySQL/MariaDB
SHOW INDEX FROM customers;

Key columns:

  • Non_unique: 0 = unique index, 1 = non-unique
  • Key_name: index name
  • Column_name: which column(s) are indexed
  • Cardinality: approximate unique values

Analyzing Query Performance with EXPLAIN

Use EXPLAIN to see if queries use indexes:

sql
-- Without EXPLAIN - just runs query
SELECT * FROM customers WHERE email = 'alice@example.com';

-- With EXPLAIN - shows execution plan
EXPLAIN SELECT * FROM customers WHERE email = 'alice@example.com';

Key columns to watch:

  • type: How table is accessed
    • ALL = table scan (BAD - no index used)
    • index = index scan (OKAY)
    • ref = index lookup (GOOD)
    • const = primary key lookup (EXCELLENT)
  • possible_keys: Which indexes could be used
  • key: Which index was actually used (NULL = no index!)
  • rows: Estimated rows examined (lower is better)

Bad query (no index):

sql
EXPLAIN SELECT * FROM customers
WHERE UPPER(email) = 'ALICE@EXAMPLE.COM';

Function UPPER(email) prevents index usage! Database must scan all 10,000 rows.

Composite Indexes - Order Matters

With composite indexes, column order is critical:

sql
-- Create composite index: (last_name, first_name, city)
CREATE INDEX idx_customers_location
ON customers (last_name, first_name, city);

This index helps these queries:

sql
-- ✅ Uses index - searches leftmost column
SELECT * FROM customers WHERE last_name = 'Smith';

-- ✅ Uses index - searches first two columns
SELECT * FROM customers
WHERE last_name = 'Smith' AND first_name = 'John';

-- ✅ Uses index - searches all three columns
SELECT * FROM customers
WHERE last_name = 'Smith' AND first_name = 'John' AND city = 'NYC';

This index does NOT help these queries:

sql
-- ❌ Cannot use index - skips leftmost column
SELECT * FROM customers WHERE first_name = 'John';

-- ❌ Cannot use index - skips leftmost column
SELECT * FROM customers WHERE city = 'NYC';

-- ⚠️ Partially uses index - only on last_name
SELECT * FROM customers
WHERE last_name = 'Smith' AND city = 'NYC';

Leftmost prefix rule: Composite index (A, B, C) works for:

  • A
  • A, B
  • A, B, C

But NOT for: B, C, or B alone, or C alone

Dropping Indexes

Remove unused or redundant indexes:

sql
-- Drop single index
DROP INDEX idx_customers_email ON customers;

Be careful! Dropping an index used by frequent queries will hurt performance. Check query patterns before dropping indexes.

Index Maintenance

Over time, indexes can become fragmented. Rebuild periodically:

sql
-- MySQL - recreate index
ALTER TABLE customers DROP INDEX idx_customers_email;
CREATE INDEX idx_customers_email ON customers (email);

-- Or use OPTIMIZE TABLE
OPTIMIZE TABLE customers;

When to rebuild:

  • After bulk inserts/updates/deletes
  • If query performance degrades over time
  • During scheduled maintenance windows

Practical Example: E-commerce Performance

Optimize an e-commerce database:

sql
-- Orders table - millions of rows
CREATE TABLE orders (
    order_id INT PRIMARY KEY,  -- Automatically indexed
    customer_id INT NOT NULL,
    order_date DATE NOT NULL,
    status VARCHAR(20),
    total DECIMAL(10, 2)
);

-- Critical indexes for performance
CREATE INDEX idx_orders_customer ON orders (customer_id);
CREATE INDEX idx_orders_date ON orders (order_date);
CREATE INDEX idx_orders_status ON orders (status);

-- Composite index for common query
CREATE INDEX idx_orders_customer_date
ON orders (customer_id, order_date);

Query benefits:

sql
-- Fast: uses idx_orders_customer_date
SELECT * FROM orders
WHERE customer_id = 123
ORDER BY order_date DESC;

-- Fast: uses idx_orders_status
SELECT * FROM orders
WHERE status = 'pending';

-- Fast: uses idx_orders_date
SELECT * FROM orders
WHERE order_date >= '2024-01-01';

Common Mistakes to Avoid

Don't make these mistakes:

  1. Creating too many indexes: ❌

    • More indexes = slower writes
    • Each INSERT/UPDATE/DELETE updates all indexes
    • Start with indexes on most frequent queries
  2. Not using EXPLAIN: ❌

    • Always check if queries use indexes
    • EXPLAIN SELECT ... shows execution plan
  3. Functions on indexed columns: ❌

    • WHERE YEAR(order_date) = 2024 won't use index
    • Use WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'
  4. Wrong composite index order: ❌

    • Put most selective column first
    • Consider query patterns
  5. Indexing low cardinality columns: ❌

    • Boolean, gender, status with few values
    • Index won't help much
  6. Forgetting foreign keys: ❌

    • Always index foreign key columns
    • Dramatically speeds up JOINs
  7. No index on WHERE/ORDER BY columns: ❌

    • If you filter/sort by it, index it

Index Best Practices

Indexing Guidelines
Scenario
Action
WHERE clausesIndex columns used in WHERE
JOIN columnsAlways index foreign keys
ORDER BYIndex sort columns
Small tablesSkip indexes (< 1000 rows)
Low cardinalitySkip indexes (few unique values)
Frequent writesMinimize indexes
Frequent readsAdd indexes liberally
Composite indexesMost selective column first

Key Takeaways

What you learned:

Indexes speed up queries by creating sorted lookup structures ✅ Trade-off: faster reads, slightly slower writes ✅ CREATE INDEX idx_name ON table (column) - creates index ✅ Index columns used in WHERE, JOIN, ORDER BY ✅ PRIMARY KEY automatically creates unique index ✅ Foreign keys should always be indexed ✅ Composite indexes: order matters (leftmost prefix rule) ✅ Use EXPLAIN to check if queries use indexes ✅ SHOW INDEX FROM table - view existing indexes ✅ Drop unused indexes to improve write performance ✅ Avoid functions on indexed columns ✅ High cardinality columns make better indexes

Performance impact:

  • Proper indexes: 10x-1000x faster queries
  • Table scan 1M rows: 2+ seconds
  • Index lookup: 0.003 seconds
  • But every write updates indexes (slight overhead)

Real-world rule: Index columns you frequently filter, join, or sort by!

Practice Exercise:

  1. Create indexes on customers (email, last_name)
  2. Use EXPLAIN to check if queries use indexes
  3. Create composite index (last_name, first_name)
  4. Compare query performance with and without indexes
  5. Index foreign keys in orders (customer_id)
  6. Find queries doing table scans and add indexes