Indexes & Performance
Speed up queries with indexes - learn when and how to use them effectively.
Introduction to Indexes
Imagine searching for a specific book in a library with 1 million books - but they're randomly arranged on shelves. You'd have to check every book! Now imagine the same library with a card catalog organized alphabetically. Finding your book is instant.
Indexes are database card catalogs. They dramatically speed up data retrieval by creating organized lookup structures. Without indexes, databases scan every row (slow). With indexes, they jump directly to the data (fast).
A data structure that improves the speed of data retrieval operations on a database table. Indexes create a sorted copy of selected columns with pointers to the actual rows, allowing fast lookups without scanning the entire table.
Why use indexes?
- Speed: Find rows in milliseconds instead of seconds
- Performance: Queries run 10x-1000x faster with proper indexes
- Scalability: Handle millions of rows efficiently
- User experience: Faster page loads, better responsiveness
Trade-offs:
- Storage: Indexes take up disk space
- Write speed: INSERT/UPDATE/DELETE slightly slower (must update indexes)
- Maintenance: Indexes need occasional rebuilding
Think of it this way: indexes make reads faster but writes slightly slower.
How Indexes Work
Without an index, the database performs a table scan - checking every row:
-- Without index: scans all 1,000,000 rows
SELECT * FROM customers
WHERE email = 'alice@example.com';With an index on email, the database jumps directly to the row:
-- With index: uses index lookup
SELECT * FROM customers
WHERE email = 'alice@example.com';How it works:
- Index maintains a sorted copy of the email column
- Uses binary search to find 'alice@example.com' quickly
- Retrieves pointer to actual row location
- Fetches row data in one operation
Analogy: Like a book's index at the back - instead of reading every page to find "SQL", you look in the index and jump to page 247.
Creating Indexes
Basic syntax:
CREATE INDEX index_name
ON table_name (column_name);
Let's create indexes on a customers table:
-- Create single-column index
CREATE INDEX idx_customers_email
ON customers (email);-- Create index on last name
CREATE INDEX idx_customers_last_name
ON customers (last_name);Naming convention:
idx_tablename_columnname- Example:
idx_customers_email,idx_orders_date - Makes purpose clear and easy to manage
Types of Indexes
1. Single-column index - Most common, indexes one column:
CREATE INDEX idx_products_price
ON products (price);2. Composite index (multi-column) - Indexes multiple columns together:
-- Index on last_name AND first_name together
CREATE INDEX idx_customers_name
ON customers (last_name, first_name);Composite index benefits:
- Speeds up queries filtering on both columns
- ORDER matters! (last_name, first_name) ≠ (first_name, last_name)
- Can use left portions: last_name alone uses index, but first_name alone doesn't
3. UNIQUE index - Ensures no duplicates + speeds up searches:
-- Unique index on email (no duplicates allowed)
CREATE UNIQUE INDEX idx_users_email
ON users (email);PRIMARY KEY and UNIQUE constraints automatically create unique indexes!
4. Full-text index - For text search in large text columns:
-- Full-text search on article content
CREATE FULLTEXT INDEX idx_articles_content
ON articles (title, content);When to Use Indexes
Create indexes on columns that are:
✅ Frequently in WHERE clauses:
-- Frequent query - needs index
SELECT * FROM orders
WHERE order_date = '2024-01-15';
-- Create index
CREATE INDEX idx_orders_date ON orders (order_date);✅ Used in JOIN conditions:
-- Foreign keys should be indexed
CREATE INDEX idx_orders_customer_id
ON orders (customer_id);
SELECT * FROM orders o
JOIN customers c ON o.customer_id = c.customer_id;✅ Used in ORDER BY:
-- Often sorted by date
CREATE INDEX idx_orders_date ON orders (order_date);
SELECT * FROM orders
ORDER BY order_date DESC;✅ Columns with high cardinality (many unique values):
- Email addresses - GOOD (unique for each user)
- Country - POOR (only ~200 unique values)
- Gender - VERY POOR (only 2-3 values)
When NOT to Use Indexes
Don't create indexes on:
❌ Small tables (< 1000 rows)
- Table scan is faster than index overhead
❌ Columns with low cardinality
- Boolean fields (TRUE/FALSE)
- Gender (M/F)
- Status with few values ('active', 'inactive')
❌ Columns rarely queried
- Don't index unused columns
❌ Tables with frequent writes
- Every INSERT/UPDATE/DELETE must update indexes
- Slows down writes significantly
❌ Very long text columns
- Full-text indexes are better for text search
Rule of thumb: If a column appears in WHERE, JOIN, or ORDER BY frequently AND the table has 1000+ rows, it probably needs an index.
Viewing Existing Indexes
See what indexes exist on a table:
-- MySQL/MariaDB
SHOW INDEX FROM customers;Key columns:
- Non_unique: 0 = unique index, 1 = non-unique
- Key_name: index name
- Column_name: which column(s) are indexed
- Cardinality: approximate unique values
Analyzing Query Performance with EXPLAIN
Use EXPLAIN to see if queries use indexes:
-- Without EXPLAIN - just runs query
SELECT * FROM customers WHERE email = 'alice@example.com';
-- With EXPLAIN - shows execution plan
EXPLAIN SELECT * FROM customers WHERE email = 'alice@example.com';Key columns to watch:
- type: How table is accessed
ALL= table scan (BAD - no index used)index= index scan (OKAY)ref= index lookup (GOOD)const= primary key lookup (EXCELLENT)
- possible_keys: Which indexes could be used
- key: Which index was actually used (NULL = no index!)
- rows: Estimated rows examined (lower is better)
Bad query (no index):
EXPLAIN SELECT * FROM customers
WHERE UPPER(email) = 'ALICE@EXAMPLE.COM';Function UPPER(email) prevents index usage! Database must scan all 10,000 rows.
Composite Indexes - Order Matters
With composite indexes, column order is critical:
-- Create composite index: (last_name, first_name, city)
CREATE INDEX idx_customers_location
ON customers (last_name, first_name, city);This index helps these queries:
-- ✅ Uses index - searches leftmost column
SELECT * FROM customers WHERE last_name = 'Smith';
-- ✅ Uses index - searches first two columns
SELECT * FROM customers
WHERE last_name = 'Smith' AND first_name = 'John';
-- ✅ Uses index - searches all three columns
SELECT * FROM customers
WHERE last_name = 'Smith' AND first_name = 'John' AND city = 'NYC';This index does NOT help these queries:
-- ❌ Cannot use index - skips leftmost column
SELECT * FROM customers WHERE first_name = 'John';
-- ❌ Cannot use index - skips leftmost column
SELECT * FROM customers WHERE city = 'NYC';
-- ⚠️ Partially uses index - only on last_name
SELECT * FROM customers
WHERE last_name = 'Smith' AND city = 'NYC';Leftmost prefix rule: Composite index (A, B, C) works for:
- A
- A, B
- A, B, C
But NOT for: B, C, or B alone, or C alone
Dropping Indexes
Remove unused or redundant indexes:
-- Drop single index
DROP INDEX idx_customers_email ON customers;Be careful! Dropping an index used by frequent queries will hurt performance. Check query patterns before dropping indexes.
Index Maintenance
Over time, indexes can become fragmented. Rebuild periodically:
-- MySQL - recreate index
ALTER TABLE customers DROP INDEX idx_customers_email;
CREATE INDEX idx_customers_email ON customers (email);
-- Or use OPTIMIZE TABLE
OPTIMIZE TABLE customers;When to rebuild:
- After bulk inserts/updates/deletes
- If query performance degrades over time
- During scheduled maintenance windows
Practical Example: E-commerce Performance
Optimize an e-commerce database:
-- Orders table - millions of rows
CREATE TABLE orders (
order_id INT PRIMARY KEY, -- Automatically indexed
customer_id INT NOT NULL,
order_date DATE NOT NULL,
status VARCHAR(20),
total DECIMAL(10, 2)
);
-- Critical indexes for performance
CREATE INDEX idx_orders_customer ON orders (customer_id);
CREATE INDEX idx_orders_date ON orders (order_date);
CREATE INDEX idx_orders_status ON orders (status);
-- Composite index for common query
CREATE INDEX idx_orders_customer_date
ON orders (customer_id, order_date);Query benefits:
-- Fast: uses idx_orders_customer_date
SELECT * FROM orders
WHERE customer_id = 123
ORDER BY order_date DESC;
-- Fast: uses idx_orders_status
SELECT * FROM orders
WHERE status = 'pending';
-- Fast: uses idx_orders_date
SELECT * FROM orders
WHERE order_date >= '2024-01-01';Common Mistakes to Avoid
Don't make these mistakes:
-
Creating too many indexes: ❌
- More indexes = slower writes
- Each INSERT/UPDATE/DELETE updates all indexes
- Start with indexes on most frequent queries
-
Not using EXPLAIN: ❌
- Always check if queries use indexes
EXPLAIN SELECT ...shows execution plan
-
Functions on indexed columns: ❌
WHERE YEAR(order_date) = 2024won't use index- Use
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'
-
Wrong composite index order: ❌
- Put most selective column first
- Consider query patterns
-
Indexing low cardinality columns: ❌
- Boolean, gender, status with few values
- Index won't help much
-
Forgetting foreign keys: ❌
- Always index foreign key columns
- Dramatically speeds up JOINs
-
No index on WHERE/ORDER BY columns: ❌
- If you filter/sort by it, index it
Index Best Practices
Scenario | Action |
|---|---|
| WHERE clauses | Index columns used in WHERE |
| JOIN columns | Always index foreign keys |
| ORDER BY | Index sort columns |
| Small tables | Skip indexes (< 1000 rows) |
| Low cardinality | Skip indexes (few unique values) |
| Frequent writes | Minimize indexes |
| Frequent reads | Add indexes liberally |
| Composite indexes | Most selective column first |
Key Takeaways
What you learned:
✅ Indexes speed up queries by creating sorted lookup structures
✅ Trade-off: faster reads, slightly slower writes
✅ CREATE INDEX idx_name ON table (column) - creates index
✅ Index columns used in WHERE, JOIN, ORDER BY
✅ PRIMARY KEY automatically creates unique index
✅ Foreign keys should always be indexed
✅ Composite indexes: order matters (leftmost prefix rule)
✅ Use EXPLAIN to check if queries use indexes
✅ SHOW INDEX FROM table - view existing indexes
✅ Drop unused indexes to improve write performance
✅ Avoid functions on indexed columns
✅ High cardinality columns make better indexes
Performance impact:
- Proper indexes: 10x-1000x faster queries
- Table scan 1M rows: 2+ seconds
- Index lookup: 0.003 seconds
- But every write updates indexes (slight overhead)
Real-world rule: Index columns you frequently filter, join, or sort by!
Practice Exercise:
- Create indexes on customers (email, last_name)
- Use EXPLAIN to check if queries use indexes
- Create composite index (last_name, first_name)
- Compare query performance with and without indexes
- Index foreign keys in orders (customer_id)
- Find queries doing table scans and add indexes