Understanding Query Performance: A Deep Dive into EXPLAIN ANALYZE

When your database starts to slow down, diagnosing the exact source of performance issues can be challenging. Fortunately, PostgreSQL provides a powerful tool for this purpose: EXPLAIN ANALYZE. This command offers crucial insights into how PostgreSQL executes your queries, helping you identify bottlenecks and opportunities for optimization.

What is EXPLAIN ANALYZE?

The EXPLAIN ANALYZE command shows you the execution plan that PostgreSQL’s query planner generates for your SQL statements, along with actual runtime statistics. This provides a comprehensive view of:

  • How the database searches for your data
  • Which indexes (if any) are being used
  • How tables are being joined
  • The actual time spent on each operation
  • How many rows are processed at each step

Basic Usage

The syntax is straightforward:

EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'john@example.com';

The output might look like this:

QUERY PLAN
-----------------------------------------------------------------
Index Scan using idx_user_email on users  (cost=0.29..8.31 rows=1 width=72) (actual time=0.044..0.046 rows=1 loops=1)
  Index Cond: (email = 'john@example.com'::text)
Planning Time: 0.092 ms
Execution Time: 0.075 ms

Understanding the Output

Let’s break down the components of an EXPLAIN ANALYZE output:

1. Operation Types

These are the fundamental operations PostgreSQL performs:

  • Seq Scan: Full table scan, reading every row sequentially
  • Index Scan: Using an index to find specific rows
  • Index Only Scan: Using only the index without accessing the table
  • Bitmap Heap Scan/Bitmap Index Scan: Two-phase scan that first builds a bitmap of matching row locations from an index, then fetches those rows from the table
  • Nested Loop/Hash/Merge Join: Different join algorithms
  • Sort: Sorting operations for ORDER BY clauses
  • Aggregate: Calculations for functions like COUNT, SUM, AVG
  • Limit: Restricting number of rows returned

2. Cost Estimates vs. Actual Times

(cost=0.29..8.31 rows=1 width=72) (actual time=0.044..0.046 rows=1 loops=1)

This shows:

  • Startup cost (0.29): Estimated cost before first row is returned
  • Total cost (8.31): Estimated cost to return all rows
  • Estimated rows (1): Expected number of rows
  • Width (72): Estimated average row width in bytes
  • Actual time (0.044..0.046): Actual startup and total time in milliseconds, averaged per loop
  • Actual rows (1): Actual number of rows returned, also averaged per loop
  • Loops (1): Number of times this operation was executed; multiply the per-loop figures by this to get totals
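These annotations are regular enough to pull apart programmatically. As a rough illustration, here is a hypothetical Python helper (not part of any PostgreSQL client library) that extracts the numeric fields from a single plan line:

```python
import re

# Matches the "(cost=...)" estimate and "(actual time=...)" runtime
# annotations that EXPLAIN ANALYZE appends to each plan node.
PLAN_STATS = re.compile(
    r"\(cost=(?P<startup_cost>[\d.]+)\.\.(?P<total_cost>[\d.]+) "
    r"rows=(?P<est_rows>\d+) width=(?P<width>\d+)\) "
    r"\(actual time=(?P<startup_ms>[\d.]+)\.\.(?P<total_ms>[\d.]+) "
    r"rows=(?P<actual_rows>\d+) loops=(?P<loops>\d+)\)"
)

def parse_plan_line(line: str) -> dict:
    """Return the numeric fields of one plan node as floats."""
    match = PLAN_STATS.search(line)
    if match is None:
        raise ValueError("no statistics found in plan line")
    return {name: float(value) for name, value in match.groupdict().items()}
```

For the index scan shown earlier, this yields total_cost 8.31, actual_rows 1.0, and loops 1.0.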

3. Planning Time and Execution Time

Planning Time: 0.092 ms
Execution Time: 0.075 ms
  • Planning Time: How long PostgreSQL spent creating the execution plan
  • Execution Time: Actual time to execute the query

Reading Execution Plans: A Hierarchical Structure

Execution plans are read from the innermost (most indented) operations to the outermost ones. Each operation feeds its results to the operation above it.

Example:

QUERY PLAN
-----------------------------------------------------------------
Hash Join  (cost=10.57..21.14 rows=10 width=73) (actual time=0.136..0.138 rows=10 loops=1)
  Hash Cond: (orders.customer_id = customers.id)
  ->  Seq Scan on orders  (cost=0.00..8.50 rows=350 width=28) (actual time=0.011..0.013 rows=20 loops=1)
  ->  Hash  (cost=9.50..9.50 rows=10 width=45) (actual time=0.083..0.083 rows=10 loops=1)
        ->  Seq Scan on customers  (cost=0.00..9.50 rows=10 width=45) (actual time=0.008..0.009 rows=10 loops=1)
              Filter: (status = 'active'::text)

Reading order:

  1. Seq Scan on customers with a filter
  2. Hash creation from customers data
  3. Seq Scan on orders table
  4. Hash Join combining both using the specified condition

Common Performance Issues to Look For

1. Sequential Scans on Large Tables

Seq Scan on users (cost=0.00..458.00 rows=10000 width=182) (actual time=0.028..8.312 rows=10000 loops=1)

This indicates PostgreSQL is reading every row in the users table. For large tables, this suggests you might need an index.

2. Significantly Wrong Row Estimates

When the estimated row count differs wildly from the actual count, the planner is usually working from outdated statistics:

Index Scan using idx_email on users (cost=0.29..8.31 rows=1 width=72) (actual time=0.044..12.046 rows=10000 loops=1)

Solution: Run ANALYZE on your table to update statistics.
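One way to spot this automatically, sketched here as a hypothetical helper rather than any built-in feature, is to express the two row counts as a ratio and flag large discrepancies:

```python
def estimate_skew(estimated_rows: float, actual_rows: float) -> float:
    """How far off the planner's row estimate was, as a ratio >= 1.0.

    Values far above ~10 are a strong hint to re-run ANALYZE so the
    planner works from fresh statistics.
    """
    estimated = max(estimated_rows, 1.0)
    actual = max(actual_rows, 1.0)
    return max(estimated, actual) / min(estimated, actual)
```

For the plan above (rows=1 estimated, rows=10000 actual), the skew is 10000.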

3. Inefficient Join Methods

Hash joins are typically fast, but nested loops on large datasets can be problematic:

Nested Loop (cost=0.29..2844.29 rows=1000 width=68) (actual time=0.068..425.072 rows=10000 loops=1)

Consider adding indexes on the join columns or increasing the work_mem parameter.

4. High Planning Times

If planning time is unusually high, the planner may be wrestling with a complex query, a large number of joined tables, or many partitions and candidate indexes:

Planning Time: 230.092 ms

5. Multiple Passes (Loops > 1)

When loops are greater than 1, PostgreSQL is executing the same operation multiple times:

Index Scan using idx_order_id on order_items (cost=0.29..8.31 rows=5 width=72) (actual time=0.044..0.046 rows=5 loops=1000)

This might indicate opportunities for better join strategies or materialization.
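Because the reported actual times are averages per loop, the true cost of such a node is the per-loop time multiplied by the loop count; a one-line sketch of the arithmetic:

```python
def node_total_ms(per_loop_ms: float, loops: int) -> float:
    """Total time a plan node consumed: EXPLAIN ANALYZE reports actual
    time averaged per loop, so multiply by the loop count."""
    return per_loop_ms * loops
```

The index scan above therefore costs about 0.046 ms x 1000 loops = 46 ms overall, despite looking cheap at first glance.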

Advanced Features

Buffers Information

Add BUFFERS to see I/O statistics:

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM users WHERE email = 'john@example.com';

Output includes:

Buffers: shared hit=4 read=1

This shows how many buffer blocks were:

  • Hit in cache (faster)
  • Read from disk (slower)
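A simple derived metric (sketched here as an illustration, not a built-in) is the cache hit ratio for a node:

```python
def buffer_hit_ratio(shared_hit: int, shared_read: int) -> float:
    """Fraction of shared-buffer accesses served from PostgreSQL's
    cache rather than read from disk (or the OS page cache)."""
    total = shared_hit + shared_read
    return shared_hit / total if total else 1.0
```

For the output above (shared hit=4 read=1), 4 of 5 blocks, or 80%, came from cache.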

Format Options

For more readable output:

EXPLAIN (ANALYZE, FORMAT JSON) SELECT * FROM users WHERE email = 'john@example.com';

Available formats: TEXT (default), XML, JSON, YAML
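The JSON format is especially handy for tooling, since the plan can be consumed with an ordinary JSON parser. A sketch using a trimmed-down sample of the structure EXPLAIN (ANALYZE, FORMAT JSON) returns (an array with one object per statement; the real output carries many more keys per node):

```python
import json

# Trimmed-down sample of EXPLAIN (ANALYZE, FORMAT JSON) output.
sample = """[
  {
    "Plan": {"Node Type": "Index Scan", "Actual Total Time": 0.046},
    "Planning Time": 0.092,
    "Execution Time": 0.075
  }
]"""

plan = json.loads(sample)[0]   # one object per statement
root = plan["Plan"]            # the root plan node
```

From here, plan["Execution Time"] gives 0.075 and root["Node Type"] gives "Index Scan", with no text parsing required.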

Real-World Optimization Examples

Example 1: Adding Missing Index

Original query:

EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_email = 'john@example.com';

Seq Scan on orders (cost=0.00..2230.00 rows=42 width=120) (actual time=0.536..15.478 rows=38 loops=1)
  Filter: (customer_email = 'john@example.com'::text)
  Rows Removed by Filter: 49962
Planning Time: 0.088 ms
Execution Time: 15.502 ms
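The "Rows Removed by Filter" line is the key signal here: the scan read 50,000 rows to keep 38. As a quick illustrative sketch, the filter's selectivity can be computed from those two numbers:

```python
def filter_selectivity(rows_returned: int, rows_removed: int) -> float:
    """Fraction of scanned rows that survived the filter.  A tiny
    value on a sequential scan is a classic sign that an index on the
    filtered column would pay off."""
    return rows_returned / (rows_returned + rows_removed)
```

Here the selectivity is 38 / 50000 = 0.00076: only 0.076% of the table matched, which is ideal territory for an index.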

After adding index:

CREATE INDEX idx_orders_customer_email ON orders(customer_email);

New execution plan:

Index Scan using idx_orders_customer_email on orders (cost=0.29..8.45 rows=38 width=120) (actual time=0.028..0.152 rows=38 loops=1)
  Index Cond: (customer_email = 'john@example.com'::text)
Planning Time: 0.092 ms
Execution Time: 0.183 ms

Performance improved from 15.5ms to 0.18ms!

Example 2: Optimizing Joins

Original query with nested loop join:

EXPLAIN ANALYZE
SELECT o.*, c.name 
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'completed';

Nested Loop (cost=0.29..5286.29 rows=5000 width=124) (actual time=0.056..220.391 rows=5000 loops=1)
  ->  Seq Scan on orders o (cost=0.00..180.00 rows=5000 width=84) (actual time=0.014..12.014 rows=5000 loops=1)
        Filter: (status = 'completed'::text)
  ->  Index Scan using customers_pkey on customers c (cost=0.29..1.01 rows=1 width=40) (actual time=0.039..0.041 rows=1 loops=5000)
        Index Cond: (id = o.customer_id)
Planning Time: 0.354 ms
Execution Time: 220.638 ms

After increasing work_mem:

SET work_mem = '16MB';

New execution plan:

Hash Join (cost=140.00..298.00 rows=5000 width=124) (actual time=1.464..8.834 rows=5000 loops=1)
  Hash Cond: (o.customer_id = c.id)
  ->  Seq Scan on orders o (cost=0.00..180.00 rows=5000 width=84) (actual time=0.010..1.980 rows=5000 loops=1)
        Filter: (status = 'completed'::text)
  ->  Hash (cost=77.00..77.00 rows=5000 width=40) (actual time=1.436..1.436 rows=5000 loops=1)
        Buckets: 8192  Batches: 1  Memory Usage: 421kB
        ->  Seq Scan on customers c (cost=0.00..77.00 rows=5000 width=40) (actual time=0.008..0.701 rows=5000 loops=1)
Planning Time: 0.321 ms
Execution Time: 9.586 ms

Performance improved from 220ms to 9.5ms!

Best Practices for Using EXPLAIN ANALYZE

  1. Test on representative data: Query plans can change dramatically based on table size and data distribution
  2. Test multiple times: The first execution may load data into cache, affecting subsequent runs
  3. Be careful in production: EXPLAIN ANALYZE actually executes the query, so:
    • Wrap INSERT, UPDATE, or DELETE statements in a transaction you roll back (BEGIN; EXPLAIN ANALYZE ...; ROLLBACK;)
    • Avoid running very slow queries that might impact production
  4. Compare plans before and after changes: Keep records of execution plans before making changes
  5. Focus on the highest-cost operations first: Don’t optimize operations that already run quickly

Conclusion

Mastering EXPLAIN ANALYZE is an essential skill for database performance tuning. By understanding execution plans, you can identify bottlenecks, validate optimization efforts, and ensure your database operates at peak efficiency.

Remember that optimization is iterative:

  1. Identify slow queries
  2. Analyze execution plans
  3. Make targeted improvements
  4. Measure results
  5. Repeat

While EXPLAIN ANALYZE provides valuable insights, it should be paired with other PostgreSQL performance tools like pg_stat_statements, auto_explain, and proper monitoring to build a comprehensive performance optimization strategy.

What slow queries are you currently struggling with? Try using EXPLAIN ANALYZE to uncover optimization opportunities, and you might be surprised by how much performance you can unlock with the right approach.

