Database Query Optimization: EXPLAIN ANALYZE, Indexes, and the Queries That Kill Production

A database query that runs in 3ms with 1,000 rows will often take 4,000ms with 1,000,000 rows. The query didn't change. The data changed. Most production database incidents aren't caused by code bugs — they're caused by queries that worked fine in staging (with 5,000 rows) and collapsed in production (with 50,000,000 rows).
This guide covers the tools and patterns for diagnosing and fixing slow queries in PostgreSQL and MySQL: reading query plans, choosing the right indexes, eliminating N+1 queries, and avoiding the schema patterns that guarantee future pain.
The Problem: Queries That Look Fine But Aren't
Consider this query:
SELECT u.name, u.email, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2025-01-01'
GROUP BY u.id, u.name, u.email
ORDER BY order_count DESC
LIMIT 20;
In development with 500 users and 2,000 orders: 4ms. In production with 2M users and 80M orders: 45 seconds. A sequential scan of 80M rows, a hash join, and a sort of 2M groups — all because there's no index on orders.user_id or users.created_at.
The tools to find and fix this exist in every database. Most developers don't know how to read them.
How It Works: Reading EXPLAIN ANALYZE
EXPLAIN ANALYZE is the most important debugging tool for slow queries. It shows the query plan the database chose *and* the actual execution statistics.
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.name, u.email, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2025-01-01'
GROUP BY u.id, u.name, u.email
ORDER BY order_count DESC
LIMIT 20;
Sample output (annotated):
Limit  (cost=248542.33..248542.38 rows=20 width=48) (actual time=44812.221..44812.226 rows=20 loops=1)
  ->  Sort  (cost=248542.33..253447.89 rows=1962224 width=48) (actual time=44812.219..44812.221 rows=20 loops=1)
        Sort Key: (count(o.id)) DESC
        Sort Method: top-N heapsort  Memory: 26kB
        ->  HashAggregate  (cost=185899.07..205521.31 rows=1962224 width=48) (actual time=38211.442..43187.553 rows=1962224 loops=1)
              Group Key: u.id
              Batches: 5  Memory Usage: 4145kB  Disk Usage: 36608kB   ← SPILLING TO DISK
              ->  Hash Left Join  (cost=44729.61..163355.95 rows=4508624 width=24) (actual time=1203.442..28847.221 rows=80124532 loops=1)
                    Hash Cond: (o.user_id = u.id)
                    ->  Seq Scan on orders o  (cost=0.00..82441.32 rows=4508624 width=8) (actual time=0.021..8442.112 rows=80124532 loops=1)
                          ↑ SEQUENTIAL SCAN — 80M rows
                    ->  Hash  (cost=29972.87..29972.87 rows=1182939 width=24) (actual time=1190.221..1190.221 rows=1962224 loops=1)
                          Buckets: 2097152  Batches: 2  Memory Usage: 50421kB
                          ->  Seq Scan on users u  (cost=0.00..29972.87 rows=1182939 width=24) (actual time=0.012..892.442 rows=1962224 loops=1)
                                Filter: (created_at > '2025-01-01 00:00:00'::timestamp)
                                Rows Removed by Filter: 37776   ← Only removes 37K rows from 2M scan
Planning Time: 2.442 ms
Execution Time: 44814.221 ms   ← 44 seconds
The key numbers to read:
| What to look for | Where it is | What it means |
|-----------------|-------------|---------------|
| Seq Scan on large table | Node type | Missing index — table scanned row by row |
| High actual time vs cost | Each node | Planner estimate was wrong |
| Disk Usage in aggregation | HashAggregate | Query spilling to disk — work_mem too low |
| Rows Removed by Filter | Seq Scan | Filter applied after reading all rows |
| loops=N where N > 1 | Nested Loop | Inner relation scanned N times |
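A subtlety worth automating: the `actual time` figures in each node are *averages per loop*, so a node's real cost is its upper time bound multiplied by `loops`. A small regex sketch can pull these numbers out of a saved plan line (the helper names `NODE_RE` and `node_stats` are hypothetical, not a PostgreSQL API):

```python
import re

# Matches the "(actual time=a..b rows=N loops=L)" suffix on a plan node.
NODE_RE = re.compile(
    r"actual time=(?P<start>[\d.]+)\.\.(?P<end>[\d.]+) "
    r"rows=(?P<rows>\d+) loops=(?P<loops>\d+)"
)

def node_stats(plan_line):
    """Return total rows and approximate total time for one plan node."""
    m = NODE_RE.search(plan_line)
    if m is None:
        return None
    end = float(m.group("end"))
    loops = int(m.group("loops"))
    return {
        "rows": int(m.group("rows")) * loops,  # rows summed across all loops
        "total_ms": end * loops,               # per-loop time x loop count
    }

line = ("->  Index Scan using idx_orders_user_id on orders "
        "(cost=0.43..8.45 rows=10 width=8) "
        "(actual time=0.011..0.013 rows=9 loops=2000)")
stats = node_stats(line)
# 9 rows x 2000 loops = 18000 rows; 0.013 ms x 2000 loops ≈ 26 ms total
```

This is why a Nested Loop node that looks cheap (0.013 ms) can dominate a query: multiply by loops first, then compare nodes.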
flowchart TD
A[EXPLAIN ANALYZE output] --> B{Seq Scan?}
B -- Yes on large table --> C[Add index on scan columns]
B -- No --> D{Nested Loop with loops > 100?}
D -- Yes --> E[N+1 query or missing join index]
D -- No --> F{Disk Usage in aggregation?}
F -- Yes --> G[Increase work_mem or optimize grouping]
F -- No --> H{High actual vs estimated rows?}
H -- Yes --> I[Run ANALYZE to update statistics]
H -- No --> J[Query is probably fine]
style C fill:#ef4444,color:#fff
style E fill:#ef4444,color:#fff
style G fill:#f59e0b,color:#fff
style I fill:#f59e0b,color:#fff
style J fill:#22c55e,color:#fff
Implementation: Fixing the Common Culprits
Fix 1: Add the Right Indexes
The query above needs two indexes:
-- Index for the WHERE clause on users
CREATE INDEX idx_users_created_at ON users(created_at);

-- Index for the JOIN condition on orders
CREATE INDEX idx_orders_user_id ON orders(user_id);
After adding these, EXPLAIN ANALYZE shows:
Execution Time: 312 ms ← Down from 44,814ms
Choosing index types:
-- B-tree (default): range queries, equality, ORDER BY
CREATE INDEX idx_orders_created_at ON orders(created_at); -- WHERE created_at > '...'
-- Hash: equality only (faster for = than B-tree, no range support)
-- Crash-safe only since PostgreSQL 10; rarely needed in practice
CREATE INDEX idx_orders_status_hash ON orders USING HASH (status);
-- Partial index: index only rows that match a condition
-- Smaller index, faster for filtered queries
CREATE INDEX idx_orders_pending ON orders(user_id)
WHERE status = 'pending'; -- Only indexes pending orders
-- Composite index: column order matters
-- This index helps: WHERE user_id = ? AND created_at > ?
-- Also helps: WHERE user_id = ? (leading column)
-- Does NOT help: WHERE created_at > ? (non-leading column)
CREATE INDEX idx_orders_user_created ON orders(user_id, created_at);
-- Covering index (PostgreSQL 11+): INCLUDE non-key columns so the query
-- can be answered from the index alone, avoiding heap access entirely
CREATE INDEX idx_orders_covering ON orders(user_id, created_at)
INCLUDE (status, total_amount);
The most common mistake: creating an index on a low-cardinality column used in a WHERE clause. An index on status where status is one of three values ('pending', 'active', 'closed') helps little — when roughly a third of the table matches, the planner will still choose a sequential scan, because reading the whole table in order is cheaper than driving tens of millions of random heap fetches through the index.
-- WRONG: low cardinality, index will be ignored
CREATE INDEX idx_users_is_active ON users(is_active); -- Only 2 values

-- RIGHT: high cardinality, or restrict with a partial index
CREATE INDEX idx_users_active ON users(id)
WHERE is_active = true; -- Now a small index covering only active users
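The planner's seq-scan-versus-index decision can be approximated with a toy cost model. The constants below are illustrative stand-ins, not PostgreSQL's real cost parameters, but the crossover behavior they produce is the same:

```python
# Toy cost model: a seq scan pays one sequential read per page; an index
# scan pays a random heap fetch per matching row. Low selectivity (many
# matches) makes the index path more expensive, not less.

SEQ_PAGE_COST = 1.0      # relative cost of one sequential page read
RANDOM_FETCH_COST = 4.0  # relative cost of one random heap fetch

def cheaper_plan(total_rows, rows_per_page, selectivity):
    seq_cost = (total_rows / rows_per_page) * SEQ_PAGE_COST
    index_cost = total_rows * selectivity * RANDOM_FETCH_COST
    return "index scan" if index_cost < seq_cost else "seq scan"

# status = 'pending' matching ~33% of 80M rows: the seq scan wins
print(cheaper_plan(80_000_000, 100, 0.33))       # seq scan
# a single user_id matching a handful of rows: the index wins easily
print(cheaper_plan(80_000_000, 100, 0.0000005))  # index scan
```

The real planner uses measured statistics and tunable costs (`seq_page_cost`, `random_page_cost`), but the shape of the trade-off is exactly this.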
Fix 2: The N+1 Query Destroyer
N+1 is the single most common database performance problem in ORMs. The pattern:
# ORM code that looks innocent
users = User.objects.filter(created_at__gt='2025-01-01')
for user in users:
    # THIS IS N+1 — one query per user
    print(user.orders.count())
What happens: 1 query to get N users, then N queries to get each user's order count. With 10,000 users: 10,001 database round trips. At 2ms per round trip: 20 seconds.
-- What the ORM is running (10,001 queries):
SELECT * FROM users WHERE created_at > '2025-01-01';
SELECT COUNT(*) FROM orders WHERE user_id = 1;
SELECT COUNT(*) FROM orders WHERE user_id = 2;
-- ... × 9,999 more
The fix: annotate or JOIN to get everything in one query:
# Django: annotate with COUNT in a single query
from django.db.models import Count
users = (
    User.objects
    .filter(created_at__gt='2025-01-01')
    .annotate(order_count=Count('orders'))
    .order_by('-order_count')[:20]
)
# 1 query total: the JOIN + COUNT SQL from earlier
# SQLAlchemy: same idea
from sqlalchemy import desc, func
result = (
    session.query(User, func.count(Order.id).label('order_count'))
    .outerjoin(Order, Order.user_id == User.id)
    .filter(User.created_at > '2025-01-01')
    .group_by(User.id)
    .order_by(desc('order_count'))
    .limit(20)
    .all()
)
Detecting N+1 in production:
# Django Debug Toolbar shows query count per request
# For production monitoring, track queries-per-request in your APM
# Or log slow query patterns manually:
import django.db.backends.utils as db_utils
import logging
original = db_utils.CursorWrapper.execute
def patched_execute(self, sql, params=None):
    logging.debug(f"QUERY: {sql[:200]}")
    return original(self, sql, params)

db_utils.CursorWrapper.execute = patched_execute
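A framework-agnostic variant of the same idea: normalize captured statements so that queries differing only in literal values collapse to one shape, then flag shapes repeated many times in a single request. The helper names here are hypothetical:

```python
import re
from collections import Counter

def normalize(sql):
    """Collapse literals so queries differing only in values share a shape."""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals -> ?
    sql = re.sub(r"\b\d+\b", "?", sql)   # numeric literals -> ?
    return sql.strip()

def n_plus_one_suspects(queries, threshold=10):
    """Return query shapes repeated more than `threshold` times."""
    counts = Counter(normalize(q) for q in queries)
    return {shape: n for shape, n in counts.items() if n > threshold}

# Simulated request log: one list query plus 500 per-row lookups
queries = ["SELECT COUNT(*) FROM orders WHERE user_id = %d" % i
           for i in range(500)]
queries.append("SELECT * FROM users WHERE created_at > '2025-01-01'")
suspects = n_plus_one_suspects(queries)
# one suspect shape, repeated 500 times
```

Running this against per-request query logs in CI is a cheap way to catch N+1 regressions before they reach production traffic.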
Fix 3: JSONB Performance (PostgreSQL)
Storing semi-structured data in JSONB columns is convenient but has performance traps:
-- SLOW: no index on JSONB field, full table scan
SELECT * FROM events WHERE metadata->>'user_id' = '12345';
-- FIX 1: GIN index on entire JSONB column (handles any key lookup)
CREATE INDEX idx_events_metadata_gin ON events USING GIN (metadata);
-- FIX 2: Expression index on specific key (smaller, faster for single key)
CREATE INDEX idx_events_user_id ON events((metadata->>'user_id'));
-- FIX 3: If you always query the same keys, use a generated column
ALTER TABLE events
ADD COLUMN user_id_extracted TEXT
GENERATED ALWAYS AS (metadata->>'user_id') STORED;
CREATE INDEX idx_events_user_id_col ON events(user_id_extracted);
-- Now it's just a regular column with a regular index
Fix 4: The Pagination Trap
OFFSET-based pagination is fine with small datasets. With large ones, it's a silent killer:
-- SLOW: scans and discards 999,980 rows to return rows 999,981-1,000,000
SELECT * FROM events ORDER BY created_at DESC LIMIT 20 OFFSET 999980;
-- FIX: cursor-based pagination using the last seen ID
-- First page:
SELECT * FROM events ORDER BY created_at DESC, id DESC LIMIT 20;
-- Subsequent pages (pass last_created_at and last_id from previous response):
SELECT * FROM events
WHERE (created_at, id) < ('2025-06-15 10:23:44', 98765)
ORDER BY created_at DESC, id DESC
LIMIT 20;
The cursor approach scans exactly 20 rows regardless of which page you're on. The OFFSET approach scans N+20 rows every time.
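In application code, the cursor query above can be built with a small helper. This is a hypothetical sketch for a psycopg-style driver (`%s` placeholders); the function name and `events` table come from the example above:

```python
# Keyset pagination: the (created_at, id) tuple comparison matches the
# composite ORDER BY, so ties on created_at are broken deterministically
# by id and no row is skipped or duplicated between pages.

def keyset_page_query(last_created_at=None, last_id=None, page_size=20):
    if last_created_at is None:  # first page: no cursor yet
        return ("SELECT * FROM events "
                "ORDER BY created_at DESC, id DESC LIMIT %s",
                (page_size,))
    return ("SELECT * FROM events "
            "WHERE (created_at, id) < (%s, %s) "
            "ORDER BY created_at DESC, id DESC LIMIT %s",
            (last_created_at, last_id, page_size))

sql, params = keyset_page_query('2025-06-15 10:23:44', 98765)
# params: ('2025-06-15 10:23:44', 98765, 20)
```

The caller stores the `created_at` and `id` of the last row it rendered and passes them back for the next page; no page number or offset ever appears.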
Query Patterns to Avoid
flowchart LR
A[Dangerous Patterns]
A --> B["SELECT * (fetches all columns)"]
A --> C["WHERE LOWER(email) = ... (function prevents index use)"]
A --> D["WHERE id::text = '123' (cast prevents index use)"]
A --> E["LIKE '%search%' (leading wildcard = seq scan)"]
A --> F["OR conditions on different columns (index skipped)"]
B --> B2["SELECT only needed columns"]
C --> C2["Use citext column type or expression index"]
D --> D2["Match column type in WHERE clause"]
E --> E2["Full-text search (tsvector/tsquery)"]
F --> F2["Separate queries with UNION ALL"]
style A fill:#ef4444,color:#fff
-- FUNCTION IN WHERE: prevents index use
-- SLOW:
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
-- FIX: use citext extension for case-insensitive storage
CREATE EXTENSION IF NOT EXISTS citext;
ALTER TABLE users ALTER COLUMN email TYPE citext;
-- Now: WHERE email = 'USER@EXAMPLE.COM' uses the index
-- OR: use an expression index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
SELECT * FROM users WHERE LOWER(email) = 'user@example.com'; -- Now uses index
-- LEADING WILDCARD: full table scan
-- SLOW:
SELECT * FROM products WHERE name LIKE '%widget%';
-- FIX: full-text search with GIN index
-- coalesce() so a NULL description doesn't NULL out the whole vector
ALTER TABLE products ADD COLUMN search_vec tsvector
GENERATED ALWAYS AS (to_tsvector('english', name || ' ' || coalesce(description, ''))) STORED;
CREATE INDEX idx_products_search ON products USING GIN(search_vec);
SELECT * FROM products WHERE search_vec @@ plainto_tsquery('english', 'widget');
VACUUM, Bloat, and Table Maintenance
PostgreSQL doesn't immediately reclaim storage when rows are updated or deleted. Old row versions are kept for MVCC (Multi-Version Concurrency Control) — concurrent readers might still need them. VACUUM reclaims this dead space. When it's not running aggressively enough, tables bloat, queries slow down, and eventually the transaction ID wraparound problem causes the database to refuse writes entirely.
-- Check table bloat
SELECT
    schemaname,
    relname AS tablename,
    pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
    pg_size_pretty(
        pg_total_relation_size(relid) - pg_relation_size(relid)
    ) AS index_and_toast_size,
    n_dead_tup,
    n_live_tup,
    round(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 1) AS dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;
High dead_pct means bloat. The fix: adjust autovacuum settings for high-traffic tables.
-- Per-table autovacuum tuning (override global settings for busy tables)
ALTER TABLE orders SET (
autovacuum_vacuum_scale_factor = 0.01, -- Vacuum when 1% of rows are dead (default: 20%)
autovacuum_analyze_scale_factor = 0.005, -- Analyze when 0.5% of rows change
autovacuum_vacuum_cost_delay = 2, -- Less I/O throttling for this table
autovacuum_vacuum_threshold = 50 -- Minimum dead rows before vacuum runs
);
-- For tables with extremely high write volume, trigger manual vacuum:
VACUUM ANALYZE orders; -- Reclaims dead rows, updates statistics
-- VACUUM FULL reclaims more space but locks the table (avoid on production unless necessary)
-- Use pg_repack extension instead for zero-downtime table compaction
Connection Pooling and Query Performance
Database connections are expensive: each one holds memory on the server, maintains state, and requires a network handshake to establish. Without connection pooling, a spike to 500 concurrent web workers means 500 simultaneous database connections. PostgreSQL can handle ~100-200 connections efficiently; beyond that, performance degrades sharply.
PgBouncer is the standard connection pooler for PostgreSQL:
# pgbouncer.ini
[databases]
myapp = host=localhost port=5432 dbname=myapp

[pgbouncer]
listen_port = 6432
listen_addr = 0.0.0.0
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt

# Transaction mode: connection returned to pool after each transaction
# Best for stateless web apps; doesn't support session-level SET, advisory locks
pool_mode = transaction

# Max simultaneous app connections
max_client_conn = 2000
# Actual connections to PostgreSQL
default_pool_size = 25
# Hard limit per database
max_db_connections = 100
With PgBouncer in transaction mode: 2,000 application workers share 25 actual PostgreSQL connections. The database sees a steady 25 connections regardless of web traffic spikes.
The query performance implication: connection overhead is removed from the hot path. Queries that were spending 2-3ms establishing connections now start immediately. At high throughput, this compounds: 3ms × 1,000 req/s = 3 seconds of connection overhead per second, eliminated.
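That compounding can be checked with back-of-envelope arithmetic. The numbers below are the illustrative figures from the paragraph above, not a benchmark:

```python
# Toy model: per-request latency with and without a connection pool.
# A pooled connection is already established, so the handshake cost
# drops out of the hot path entirely.

def request_latency_ms(query_ms, connect_ms, pooled):
    return query_ms + (0 if pooled else connect_ms)

unpooled = request_latency_ms(query_ms=5, connect_ms=3, pooled=False)  # 8 ms
pooled = request_latency_ms(query_ms=5, connect_ms=3, pooled=True)     # 5 ms

# At 1,000 req/s, pooling removes 3 ms x 1,000 = 3,000 ms of overhead/second
saved_ms_per_second = (unpooled - pooled) * 1000
```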
Production Monitoring: Finding Slow Queries
Don't wait for incidents. Enable pg_stat_statements to track query performance continuously:
-- Enable in postgresql.conf:
-- shared_preload_libraries = 'pg_stat_statements'
-- pg_stat_statements.max = 10000
-- pg_stat_statements.track = all
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Find the 10 slowest queries by total time
SELECT
round(total_exec_time::numeric, 2) AS total_ms,
round(mean_exec_time::numeric, 2) AS avg_ms,
calls,
round(stddev_exec_time::numeric, 2) AS stddev_ms,
left(query, 120) AS query_preview
FROM pg_stat_statements
WHERE calls > 100
ORDER BY total_exec_time DESC
LIMIT 10;
-- Find queries with high variance (occasional spikes)
SELECT
left(query, 120) AS query,
round(mean_exec_time::numeric, 2) AS avg_ms,
round(stddev_exec_time::numeric, 2) AS stddev_ms,
round(max_exec_time::numeric, 2) AS max_ms
FROM pg_stat_statements
WHERE calls > 50 AND stddev_exec_time > mean_exec_time
ORDER BY stddev_exec_time DESC
LIMIT 10;
Set up automated alerting when mean_exec_time for high-volume queries exceeds your SLO. A 50ms query that runs 10,000 times per minute contributes 500 seconds of total database time per minute — before any other queries.
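One way to automate that check is a small script over pg_stat_statements output. The dict row format, field names, and budget here are assumptions for illustration, not a standard API — in practice the rows would come from the query above via your driver:

```python
# Flag queries whose total database time exceeds a per-minute budget.
# mean_exec_time is in milliseconds, matching pg_stat_statements.

def over_budget(stats_rows, budget_ms_per_minute=60_000):
    flagged = []
    for row in stats_rows:
        total_ms = row["mean_exec_time"] * row["calls_per_minute"]
        if total_ms > budget_ms_per_minute:
            flagged.append((row["query"], total_ms))
    return flagged

rows = [
    {"query": "SELECT ... FROM orders ...", "mean_exec_time": 50,
     "calls_per_minute": 10_000},   # 500,000 ms = 500 s of DB time/min
    {"query": "SELECT ... FROM users ...", "mean_exec_time": 2,
     "calls_per_minute": 1_000},    # 2,000 ms/min: well under budget
]
flagged = over_budget(rows)
```

Feed the result into whatever alerting channel you already use; the point is that the 50ms query, not the occasional 2-second one, is usually the real budget consumer.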
Production Considerations
Index Maintenance
Indexes aren't free. Every write pays an index update cost. Over-indexing a write-heavy table hurts throughput. Audit unused indexes regularly:
-- Find indexes that are never used
-- (Careful: unique/PK indexes can show idx_scan = 0 yet still enforce constraints)
SELECT schemaname, relname AS tablename, indexrelname AS indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelname NOT LIKE 'pg_%'
ORDER BY schemaname, relname;
-- Also check index bloat (indexes grow from updates/deletes)
SELECT
    relname AS tablename,
    indexrelname AS indexname,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 20;
Creating Indexes Without Downtime
CREATE INDEX in PostgreSQL locks the table for writes. For production tables, always use CONCURRENTLY:
-- This blocks writes until the index is built (avoid on production)
CREATE INDEX idx_orders_user_id ON orders(user_id);

-- This builds without blocking writes (takes 2-3× longer but safe for production)
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);
CREATE INDEX CONCURRENTLY can fail mid-build. If it does, it leaves an invalid index that must be dropped:
-- Check for invalid indexes after CONCURRENT builds
SELECT indexname FROM pg_indexes
WHERE tablename = 'orders'
AND indexname IN (
SELECT indexrelid::regclass::text FROM pg_index WHERE NOT indisvalid
);
-- Drop and recreate if invalid
DROP INDEX CONCURRENTLY idx_orders_user_id;
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);
Query Planner Statistics and the ANALYZE Command
PostgreSQL's query planner makes decisions based on statistics about your data: how many distinct values a column has, how values are distributed, table row counts. These statistics go stale when large amounts of data are inserted or deleted. Stale statistics cause the planner to make wrong decisions — like choosing a sequential scan when an index would be faster.
-- View current statistics for a table
SELECT
attname AS column,
n_distinct,  -- Distinct values: >0 = count, <0 = negated fraction of rows (-1 = all distinct)
correlation, -- Physical ordering correlation (1.0 = perfectly ordered, 0 = random)
null_frac, -- Fraction of NULL values
avg_width -- Average column value width in bytes
FROM pg_stats
WHERE tablename = 'orders'
ORDER BY attname;
-- Manually update statistics (fast, non-blocking)
ANALYZE orders;
-- Increase the statistics target for columns with complex distributions
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500; -- Default target is 100
ANALYZE orders (status);
When a query plan suddenly degrades after a large data load, run ANALYZE tablename first. It takes seconds and often fixes the plan without any other changes.
The `SET LOCAL` Pattern for Query Tuning
PostgreSQL allows overriding planner settings per-query for debugging. This is useful for testing whether a different strategy would be faster:
BEGIN;
-- Force sequential scan (disable index usage) to compare
SET LOCAL enable_indexscan = OFF;
SET LOCAL enable_bitmapscan = OFF;
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@example.com';
ROLLBACK; -- Revert settings

-- Or force a specific join strategy
BEGIN;
SET LOCAL enable_hashjoin = OFF;
SET LOCAL enable_mergejoin = OFF;
-- Forces nested loop join — useful when the planner chooses hash join incorrectly
EXPLAIN ANALYZE
SELECT u.*, o.total
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.id = 123;
ROLLBACK;
These are diagnostic tools, not production settings. If a plan only works with enable_indexscan = OFF, the fix is updating statistics or hints via pg_hint_plan, not disabling index scans globally.
Conclusion
Query optimization is a skill built on a small number of core techniques used repeatedly:
1. Read EXPLAIN ANALYZE before making any changes — optimize what's actually slow, not what you think is slow
2. Index selectively — high-cardinality columns used in WHERE, JOIN, and ORDER BY; partial indexes for filtered queries
3. Kill N+1 at the ORM level — annotate, eager load, or write the JOIN yourself
4. Avoid index-defeating patterns — functions in WHERE, leading wildcards, low-cardinality indexes
5. Monitor continuously — pg_stat_statements catches regressions before they become incidents
The 44-second query at the start of this guide becomes 312ms with two indexes. That's a 143× improvement. Most database performance problems aren't hard to fix — they're hard to find. The tools above make them findable.
One more habit that separates strong backend engineers from average ones: run EXPLAIN ANALYZE on your application's 10 highest-frequency queries before going to production, not after the first incident. Query plans change as data grows — a plan that looks fine at launch can degrade badly at 100× the initial data. Capturing the baseline plan lets you detect regressions before users do.
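A naive sketch of that regression check: save each query's EXPLAIN output at launch, then diff the node types later. This is a deliberately crude heuristic (substring matching on plan text), not a full plan comparator:

```python
# Flag plan regressions: node types that weren't in the baseline plan
# but appear now, e.g. a Seq Scan replacing an Index Scan as data grows.

RISKY_NODES = ("Seq Scan", "Nested Loop")

def plan_regressions(baseline_plan: str, current_plan: str):
    return [node for node in RISKY_NODES
            if node in current_plan and node not in baseline_plan]

baseline = "Index Scan using idx_orders_user_id on orders ..."
current = "Seq Scan on orders  (cost=0.00..82441.32 ...)"
print(plan_regressions(baseline, current))  # ['Seq Scan']
```

Run it from a nightly job that re-EXPLAINs the saved baseline queries; a non-empty result is worth a human look long before it becomes an incident.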