Index Tuning Strategy: Access Path Analysis & Diagnostic Workflows #

Index tuning is the discipline of matching physical storage structures to query access patterns so that the PostgreSQL planner consistently selects efficient paths — Index Scans, Bitmap Index Scans, or Index Only Scans — instead of falling back to sequential scans across millions of rows. Done systematically, it compresses execution time, reduces I/O, and lowers write amplification, and it starts not with adding indexes but with reading execution plans to understand what the planner is already doing.


Access Path Taxonomy: How the Planner Chooses Between Scans #

Before tuning any index, map the four access paths PostgreSQL can choose and the conditions under which each appears:

Plan node When the planner selects it Key EXPLAIN field
Seq Scan No usable index, or selectivity too low to justify random I/O rows, cost
Index Scan High selectivity predicate; returns rows in index order Index Cond, Rows Removed by Filter
Bitmap Index Scan + Bitmap Heap Scan Moderate selectivity; batches heap fetches for efficiency Recheck Cond, Exact vs Lossy
Index Only Scan All required columns live in the index; zero heap fetches Heap Fetches (target: 0)

Understanding cost estimation models lets you predict which path the planner will choose at a given selectivity threshold and override that decision safely when statistics diverge from reality.

Minimal annotated EXPLAIN block #

The query below fetches active orders for one customer. After creating a basic index on customer_id, the plan looks like this:

-- Baseline: index on customer_id only
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT order_id, status, total
FROM   orders
WHERE  customer_id = 1024
  AND  status      = 'active';
Index Scan using idx_orders_customer on orders
  (cost=0.56..18.42 rows=12 width=24)
  (actual time=0.031..0.089 rows=11 loops=1)
  Index Cond: (customer_id = 1024)        -- only customer_id is in the index
  Filter: (status = 'active')             -- status filtered AFTER the heap fetch
  Rows Removed by Filter: 4              -- 4 heap rows fetched then discarded
Buffers: shared hit=7 read=2             -- 2 disk reads: heap pages not in cache
Planning Time: 0.18 ms
Execution Time: 0.12 ms

Breakdown:


Core Mechanics 1 — B-Tree Index Structure and Range Scan Behaviour #

B-tree indexes are the default in PostgreSQL and the right choice for equality predicates, range scans, and ORDER BY clauses on scalar types. Their internal structure — a balanced tree of sorted keys pointing to heap TIDs — allows the planner to seek directly to the first matching leaf page and then read forward without backtracking.

Composite key column order matters #

The leftmost prefix rule determines which predicates can use a composite B-tree index:

-- Composite index: (customer_id, created_at)
CREATE INDEX idx_orders_cust_created
    ON orders (customer_id, created_at DESC);

-- This query uses BOTH columns in the index:
EXPLAIN (ANALYZE, BUFFERS)
SELECT order_id, total
FROM   orders
WHERE  customer_id = 1024
  AND  created_at  > now() - INTERVAL '30 days';
Index Scan using idx_orders_cust_created on orders
  (cost=0.56..9.13 rows=8 width=16)
  (actual time=0.024..0.051 rows=8 loops=1)
  Index Cond: ((customer_id = 1024)
               AND (created_at > (now() - '30 days'::interval)))
  Buffers: shared hit=4

Both predicates appear under Index Cond — no residual Filter. A query filtering only on created_at (skipping customer_id) cannot use this index efficiently because the leftmost prefix is absent; the planner falls back to a sequential scan or a separate single-column index.

For deep coverage of sort key ordering, fill factor tuning, and CLUSTER-aligned physical layouts, see B-tree index optimization.

Physical heap ordering #

When most reads follow the same range of an index (e.g. always the last 30 days), aligning heap page order to that index reduces random I/O across scattered heap pages:

-- Reorder heap pages to match index order (requires maintenance window)
CLUSTER orders USING idx_orders_cust_created;

-- Recalibrate statistics after physical reorder
ANALYZE orders;

CLUSTER acquires an ACCESS EXCLUSIVE lock for the duration of the rebuild — schedule it during a maintenance window. The ordering is not maintained automatically; subsequent inserts and updates scatter rows again. For online rebuilds that avoid the extended lock, pg_repack reconstructs the table without blocking reads or writes. For write-heavy tables where scatter returns quickly, the gains degrade faster than the operational cost justifies.


Index Access Path Decision Tree A flowchart showing how PostgreSQL selects an index access path based on predicate selectivity, column coverage, and index type. Query predicate WHERE / JOIN condition Usable index exists? (type match, leftmost prefix, no cast) No Seq Scan Yes All SELECT cols in index? (covering / INCLUDE columns) Yes Index Only Scan 0 heap fetches No High selectivity? few rows relative to table size Yes Index Scan No Bitmap Index + Heap Scan

Core Mechanics 2 — Covering Indexes and Index-Only Scans #

The most impactful single change to an index is often converting an Index Scan into an Index Only Scan by embedding the fetched columns into the index itself. The planner can then return query results entirely from index leaf pages without touching the heap.

Adding INCLUDE columns #

-- Before: index on (customer_id, status); fetching total requires a heap fetch
CREATE INDEX idx_orders_cust_status
    ON orders (customer_id, status);

-- After: INCLUDE adds total to the leaf pages without changing the sort key
CREATE INDEX idx_orders_cust_status_covering
    ON orders (customer_id, status)
    INCLUDE (total);
EXPLAIN (ANALYZE, BUFFERS)
SELECT total
FROM   orders
WHERE  customer_id = 1024
  AND  status      = 'active';
Index Only Scan using idx_orders_cust_status_covering on orders
  (cost=0.56..4.11 rows=11 width=8)
  (actual time=0.018..0.032 rows=11 loops=1)
  Index Cond: ((customer_id = 1024) AND (status = 'active'))
  Heap Fetches: 0            -- zero heap reads; all data served from index leaf pages
  Buffers: shared hit=3

Heap Fetches: 0 confirms the plan never touches the heap. If this value is non-zero even with a covering index, the visibility map for those pages is not yet updated — run VACUUM to set all-visible bits and drive heap fetches to zero.

For a detailed build methodology — column ordering inside INCLUDE, trade-offs with index size, and maintenance cost — see covering index design.


Core Mechanics 3 — Partial Indexes and Specialized Structures #

Not every row in a table needs every index. Partial indexes add a WHERE clause to the index definition, so the structure covers only the rows that queries actually read. The index is smaller, fits more readily in shared_buffers, and serves write-intensive tables with lower per-row maintenance cost.

Partial index for a hot subset #

-- Index only the rows that matter for the primary operational query
CREATE INDEX idx_orders_active
    ON orders (customer_id, created_at DESC)
    WHERE status = 'active';
-- The planner uses this index only when the query predicate matches
EXPLAIN (ANALYZE, BUFFERS)
SELECT order_id, created_at
FROM   orders
WHERE  customer_id = 1024
  AND  status      = 'active'
ORDER  BY created_at DESC
LIMIT  10;
Index Scan using idx_orders_active on orders
  (cost=0.42..6.18 rows=10 width=12)
  (actual time=0.016..0.039 rows=10 loops=1)
  Index Cond: (customer_id = 1024)
  Filter: (status = 'active')    -- predicate already guaranteed by index def; rows=0 removed
  Rows Removed by Filter: 0
  Buffers: shared hit=4

The index definition’s WHERE status = 'active' guarantees that only active rows are stored; the planner’s residual Filter produces zero removals. The index is orders of magnitude smaller than a full-table index on the same columns, and the buffer cache must hold fewer pages.

For detailed predicate alignment rules and soft-delete patterns, see partial index implementation.

Specialized index types for non-scalar data #

B-trees cannot efficiently index full-text, JSONB, array, or geometric data. PostgreSQL’s GIN and GiST structures are designed for these multi-valued or high-dimensional types:

-- GIN index for full-text search on a documents table
CREATE INDEX idx_docs_fts
    ON documents USING gin(to_tsvector('english', body));

-- GiST index for geospatial containment queries
CREATE INDEX idx_locations_geom
    ON locations USING gist(geom);

When examining GIN plans in EXPLAIN, look for Bitmap Index Scan using idx_docs_fts nodes; the Recheck Cond on the parent Bitmap Heap Scan indicates whether the GIN result is exact or needs revalidation. For the full taxonomy of when to prefer GIN versus GiST and their respective maintenance costs, see specialized index types (GIN/GiST).


Systematic Diagnostic Workflow #

Apply these steps in sequence for every index tuning investigation. Each step produces a concrete artifact (a plan, a ratio, a list) that feeds the next.

1. Capture a baseline plan with full buffer reporting.

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT ...
FROM   your_table
WHERE  ...;

Save the output. You need it for the before/after comparison in step 6.

2. Identify the costliest node.

Look for the plan node with the highest actual time value (shown as actual time=startup..total). In nested plans, the outer node’s time is cumulative — find the innermost expensive node.

3. Verify row estimate accuracy.

-- Compare planner estimate vs actual for the hot node's table
SELECT tablename,
       attname,
       n_distinct,
       correlation,
       null_frac
FROM   pg_stats
WHERE  tablename = 'your_table'
  AND  attname   IN ('customer_id', 'status');

A plan rows estimate that differs from actual rows by more than 10× signals stale or insufficient statistics. Run ANALYZE your_table to refresh, or increase the statistics target for skewed columns:

ALTER TABLE your_table
    ALTER COLUMN status SET STATISTICS 400;
ANALYZE your_table;

Understanding sequential vs index scan selection thresholds helps calibrate when row estimate errors flip the planner’s access path choice.

4. Audit buffer hit ratios.

From the Buffers line in EXPLAIN (ANALYZE, BUFFERS):

Buffers: shared hit=N read=M

Hit ratio = hit / (hit + read). Target >90 % for OLTP queries. A low ratio under a well-constructed index usually indicates that heap pages are scattered across many blocks — physical reordering or a covering index that avoids the heap entirely is the appropriate fix.

5. Map predicates to index columns.

-- List existing indexes for the table
SELECT indexname, indexdef
FROM   pg_indexes
WHERE  tablename = 'your_table';

Check every WHERE predicate and JOIN condition against the listed index columns. Predicates that appear as Filter (not Index Cond) in the plan are candidates for composite index extension. Watch for implicit type casts — a predicate like WHERE id = '1024' (varchar literal against integer column) bypasses the index entirely.

6. Apply one change, measure, compare.

Create the candidate index concurrently (to avoid locking production):

CREATE INDEX CONCURRENTLY idx_candidate
    ON your_table (predicate_col1, predicate_col2)
    INCLUDE (fetch_col);

Re-run the baseline query with EXPLAIN (ANALYZE, BUFFERS) and compare actual time, Buffers: read, and Rows Removed by Filter against step 1’s output. One change per iteration keeps attribution unambiguous.

7. Confirm adoption with pg_stat_user_indexes.

SELECT indexname,
       idx_scan,
       idx_tup_read,
       idx_tup_fetch
FROM   pg_stat_user_indexes
WHERE  relname = 'your_table'
ORDER  BY idx_scan DESC;

Common Pitfalls #

1. Redundant composite overlap Diagnostic: Two indexes share a leading column prefix (e.g. (customer_id) and (customer_id, status)). The narrower one is consumed by the broader — it wastes storage and adds DML overhead without enabling any new access path. Fix: Drop the narrower index; verify the composite serves all queries that used the single-column one.

2. Implicit type cast bypass Diagnostic: EXPLAIN shows Filter: ((id)::text = '1024') — a function applied to the column, preventing index use. Fix: Align the literal type with the column type: WHERE id = 1024 (integer literal, not string).

3. Stale statistics forcing sequential scans Diagnostic: plan rows is orders of magnitude from actual rows; planner chooses Seq Scan. Fix: Run ANALYZE table_name. For chronically skewed columns, increase SET STATISTICS and add pg_stat_statements monitoring to catch regressions early.

4. Partial index predicate mismatch Diagnostic: The query filter does not literally match the index WHERE clause. Even semantically equivalent expressions (e.g. status <> 'deleted' vs status IN ('active','pending')) prevent the planner from recognising the index as applicable. Fix: Rewrite the query predicate to exactly match the index WHERE clause, or create a separate partial index for the query’s specific predicate form.

5. High write amplification from index sprawl Diagnostic: pg_stat_bgwriter shows high buffers_checkpoint values; WAL volume grows faster than data volume; pg_stat_user_indexes reveals several indexes with idx_scan near zero on high-DML tables. Fix: Audit and drop unused indexes. Each removed index directly reduces WAL per INSERT/UPDATE/DELETE. The practical OLTP ceiling is 5–7 indexes per heavily-written table.

6. Visibility map not current — non-zero Heap Fetches on Index Only Scan Diagnostic: Plan shows Index Only Scan but Heap Fetches is greater than zero. Fix: Run VACUUM your_table to update the visibility map. Ensure autovacuum is running and not lagging behind write volume.


Frequently Asked Questions #

How do I determine if an index is actually being used by the query planner? #

Run EXPLAIN (ANALYZE, BUFFERS) and look for Index Scan, Bitmap Index Scan, or Index Only Scan nodes that name the index in question. Confirm ongoing usage with pg_stat_user_indexes: idx_scan should increment across normal query traffic. If the planner still selects a sequential scan, the predicate selectivity may be too low for the planner’s cost model, or statistics are stale — run ANALYZE and re-examine. For a detailed treatment of the selectivity threshold, see sequential vs index scans.

When should I prioritise covering indexes over standard B-tree structures? #

Use a covering index when the same narrow column set is fetched repeatedly and heap fetches dominate execution time. The INCLUDE clause stores payload columns in the leaf pages without adding them to the sort key, eliminating the Index Scan → heap fetch round-trip. Confirm the benefit by checking Heap Fetches: 0 in EXPLAIN (ANALYZE, BUFFERS) and comparing idx_tup_read vs idx_tup_fetch in pg_stat_user_indexes. Full build methodology is in covering index design.

How does physical table ordering affect index performance? #

When heap rows are physically arranged to match an index’s sort order, range scans read contiguous pages sequentially rather than jumping across scattered heap blocks, reducing Buffers: read counts. The ordering degrades over time as new writes scatter rows. On write-heavy tables, re-clustering frequently enough to maintain the benefit may not be practical; a covering index that avoids heap reads entirely is a more durable alternative.

What signals in EXPLAIN output indicate stale statistics are causing a bad plan? #

A plan rows value that differs from actual rows by more than 10× is the primary signal. When this discrepancy accompanies a Seq Scan on a large table or a deeply nested loop, the planner is making the wrong access-path decision because it believes fewer or more rows will be produced than actually are. Run ANALYZE on the affected table to rebuild histograms. For chronically skewed distributions — such as a status column where 98 % of rows share one value — raise default_statistics_target on that column and re-analyse.

How many indexes per table is too many? #

There is no hard upper bound, but each additional index adds WAL write overhead and slows every INSERT, UPDATE, and DELETE on the table proportional to index count and the average number of modified index entries per row. A practical ceiling for OLTP tables with high write rates is 5–7 indexes. Audit pg_stat_user_indexes on a regular schedule — any index with idx_scan = 0 that is not enforcing a uniqueness constraint or serving a scheduled maintenance query is a removal candidate.


Back to home