Filter Pushdown Mechanics in Database Execution Plans #
Filter pushdown is the optimizer’s technique of relocating WHERE predicates as close to the data retrieval layer as possible, so the fewest rows travel up the execution tree to reach join, aggregate, or sort operators.
When the Optimizer Pushes a Predicate Down #
The cost-based optimizer decides predicate placement by combining table statistics, index availability, and predicate selectivity. This decision is central to all of Reading & Interpreting Query Plans — an engineer who can read predicate placement instantly knows whether the index is being used effectively. The optimizer evaluates three levels:
- Index condition (
Index Cond) — evaluated entirely within the index structure. Only matching TIDs are returned to the executor. This is the most efficient placement: data pages that cannot satisfy the predicate are never read. - Storage-level filter (
Filteron the scan node) — embedded in aSeq ScanorIndex Scannode. The predicate is tested row-by-row as pages are read. The scan node reportsRows Removed by Filterto show the discard volume. - Deferred filter (
Filternode above the scan) — the scan returns every row it can, and a parent operator node discards non-matching rows. This placement means the predicate is either non-sargable, references a non-indexed expression, or cannot be evaluated inside the index structure.
Effective pushdown requires predicates that:
- Reference indexed columns without function wrappers or expression transforms.
- Use operators the index type supports (
=,<,>,<=,>=for B-trees;&&,@>for GIN/GiST as covered in specialized index types). - Carry matching data types — no implicit cast between the column type and the literal.
The optimizer also weighs cardinality. A predicate the planner estimates will eliminate 90% of rows is a strong pushdown candidate; a low-selectivity predicate on a small table may not justify an index scan at all. If your cardinality estimates are wrong, the optimizer may place the predicate at the wrong level — checking rows vs actual rows identifies this quickly.
Annotated EXPLAIN Node Breakdown #
Run EXPLAIN (ANALYZE, BUFFERS) and read the scan node to determine which pushdown level is active.
-- Full pushdown: predicate evaluated inside the index
EXPLAIN (ANALYZE, BUFFERS)
SELECT order_id, created_at
FROM orders
WHERE created_at >= '2024-01-01'
AND created_at < '2024-04-01';
Index Scan using idx_orders_created_at on orders
(cost=0.42..394.10 rows=1312 width=16)
(actual time=0.041..9.812 rows=1289 loops=1)
Index Cond: ((created_at >= '2024-01-01') AND (created_at < '2024-04-01'))
-- ↑ predicate evaluated inside the B-tree; only matching TIDs returned
Buffers: shared hit=147 read=0
-- ↑ zero read-miss pages: range eliminates most of the table
Planning Time: 0.312 ms
Execution Time: 10.241 ms
Key fields to inspect:
| Field | What it means |
|---|---|
Index Cond |
Predicate evaluated inside the index — best placement |
Filter on scan node |
Predicate evaluated per row during the scan — partial pushdown |
Rows Removed by Filter |
Rows discarded after retrieval — quantifies wasted work |
Buffers: shared read |
Pages fetched from disk — should be low when pushdown is effective |
actual rows vs rows |
Cardinality accuracy — large divergence can cause wrong placement |
Compare this to a deferred filter:
-- Deferred: function wrapper prevents index traversal
EXPLAIN (ANALYZE, BUFFERS)
SELECT order_id, created_at
FROM orders
WHERE EXTRACT(YEAR FROM created_at) = 2024;
Seq Scan on orders
(cost=0.00..44200.00 rows=1640 width=16)
(actual time=0.842..389.441 rows=1289 loops=1)
Filter: (EXTRACT(year FROM created_at) = '2024'::numeric)
-- ↑ standalone Filter node: all heap pages read before predicate applies
Rows Removed by Filter: 48711
-- ↑ 97% of rows fetched were discarded — extreme I/O waste
Buffers: shared hit=2100 read=10800
Planning Time: 0.198 ms
Execution Time: 390.114 ms
The EXTRACT wrapper makes the predicate non-sargable. PostgreSQL cannot traverse the index for an arbitrary function result, so every heap page is read.
Algorithm Internals: How Pushdown Decisions Are Made #
Sargability #
A predicate is sargable (Search ARGument ABLE) when the planner can directly compare a column value against a constant using an operator the index provides. The moment a function wraps the column — EXTRACT(YEAR FROM created_at), LOWER(email), status::text — the optimizer loses the ability to navigate the index for that expression and the predicate becomes non-sargable.
The rewrite pattern is consistent: expand the function result back to an index-friendly range on the raw column.
Before (non-sargable):
WHERE EXTRACT(YEAR FROM created_at) = 2024
-- Wraps the column; no index traversal possible
After (sargable range predicate):
WHERE created_at >= '2024-01-01'
AND created_at < '2025-01-01'
-- Raw column comparison; Index Cond applies
The optimizer now sees a range predicate that the B-tree index can satisfy directly. EXPLAIN (ANALYZE, BUFFERS) will show Index Cond instead of Filter.
Composite Index Column Order and Partial Pushdown #
When a composite index exists, PostgreSQL can only push a predicate down using the index if the predicate references the leading columns of the index in order. A predicate on a trailing column alone cannot use the index efficiently — the leading column values are unknown, so the index cannot be entered at the right point.
-- Index: (region_id, status)
-- Predicate references the leading column first — full Index Cond possible
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM users WHERE region_id = 42 AND status = 'active';
Index Scan using idx_users_region_status on users
(cost=0.42..185.20 rows=34 width=192)
(actual time=0.031..0.412 rows=28 loops=1)
Index Cond: (region_id = 42)
-- ↑ leading column pushed to Index Cond
Filter: (status = 'active'::text)
-- ↑ trailing column filtered after index entry retrieval
Rows Removed by Filter: 156
-- ↑ 156 rows matched region_id=42 but failed the status check
Buffers: shared hit=184
region_id = 42 reaches Index Cond but status remains a scan-level Filter. Reversing the index to (status, region_id) pushes both predicates to Index Cond when status has higher selectivity — this choice is part of the broader B-tree index optimization decision.
Type Coercion and ORM Parameters #
ORM frameworks frequently bind parameters as strings or generic objects, which forces an implicit cast at runtime. When the cast changes the column’s effective type, PostgreSQL cannot match the predicate to the index definition and falls back to a sequential scan or a deferred filter.
Type mismatch (broken pushdown):
-- ORM sends category_id as a varchar parameter
SELECT * FROM products WHERE category_id = '14';
-- PostgreSQL must cast '14'::varchar to integer at evaluation time
-- This implicit cast may not align with the index definition
Explicit cast (pushdown restored):
-- Force the literal to the correct type before plan compilation
SELECT * FROM products WHERE category_id = CAST($1 AS integer);
Index Scan using idx_products_category on products
(cost=0.42..8.44 rows=1 width=128)
(actual time=0.015..0.018 rows=1 loops=1)
Index Cond: (category_id = 14)
-- ↑ integer comparison; no implicit cast; full pushdown achieved
Buffers: shared hit=3
Memory, I/O, and Resource Behavior #
Pushdown directly controls I/O volume. When a predicate reaches Index Cond, the executor reads only the heap pages containing matching TIDs. When pushdown fails and a Filter node sits above a Seq Scan, every page in the relation is read regardless of selectivity.
Buffer hit ratio is the primary signal. With effective pushdown, Buffers: shared hit dominates and shared read (disk fetches) is minimal. With deferred filtering, shared read is proportional to total table size rather than result size.
Pushdown failures also create downstream resource pressure:
- Sort nodes receive more rows than necessary, increasing
work_memdemand and raising the risk of a temporary file spill. Sort and hash node analysis covers how to read spill indicators in the plan. - Hash join build phases allocate hash tables sized to the pre-filter row count rather than the post-filter count. Excess rows force additional
Batchesand additional I/O. - Aggregate nodes accumulate unnecessary groups, increasing CPU cost.
Recheck Cond is a related signal specific to lossy bitmap scans. When BitmapAnd or BitmapOr nodes combine results, individual heap pages may be fetched without knowing which rows on that page matched. PostgreSQL re-evaluates the predicate on each tuple after the page fetch — this shows as Recheck Cond in the plan. High Heap Fetches alongside Recheck Cond indicates that the bitmap scan is lossy and that a direct Index Scan may be cheaper. For a detailed treatment of this boundary, see Understanding Filter vs Recheck Conditions.
Step-by-Step Tuning Workflow #
Follow these steps in order when a query shows excessive Filter rows or high buffer reads.
Step 1 — Capture the baseline plan
EXPLAIN (ANALYZE, BUFFERS)
<your query here>;
-- Save the full output before making changes
Look for: Filter nodes above scan nodes, Rows Removed by Filter counts, and high shared read values. If actual rows diverges from rows by more than a factor of 2, statistics may be stale.
Step 2 — Refresh statistics on the affected table
ANALYZE orders;
-- Re-run EXPLAIN (ANALYZE, BUFFERS) to see if plan changes
Stale statistics cause cardinality misestimates. The optimizer may choose a sequential scan because it underestimates the result set size, which then prevents pushdown from being cost-effective.
Step 3 — Inspect predicate sargability
Check each WHERE clause predicate for function wrappers around indexed columns. Query pg_stats to confirm the column has useful statistics:
SELECT tablename, attname, n_distinct, correlation, most_common_vals
FROM pg_stats
WHERE tablename = 'orders'
AND attname = 'created_at';
-- correlation near 1.0 means physical order matches logical order
-- favors index scans over sequential scans
Step 4 — Verify the index definition matches the predicate
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'orders';
-- Confirm the column and operator class in the index match your predicate
A predicate on created_at needs an index that includes created_at without function transforms. If the index was created with EXTRACT(YEAR FROM created_at), it is a function-based index that only satisfies that exact expression — it will not satisfy a range predicate on the raw column.
Step 5 — Rewrite non-sargable predicates
Convert function-wrapped column references to raw-column range predicates. Replace implicit cast patterns with explicit CAST() or typed literals.
Step 6 — Verify composite index column order
If partial pushdown is occurring (some predicates reach Index Cond, others remain as Filter), query the index definition and compare column order against predicate selectivity:
SELECT a.attname, a.attnum
FROM pg_index i
JOIN pg_attribute a ON a.attrelid = i.indrelid
AND a.attnum = ANY(i.indkey)
WHERE i.indexrelid = 'idx_users_region_status'::regclass
ORDER BY a.attnum;
-- Highest-selectivity column should be position 1
Then run EXPLAIN (ANALYZE, BUFFERS) again to confirm both predicates appear under Index Cond.
Step 7 — Add a partial index for boolean and status columns
When a query consistently filters on a low-cardinality status column alongside a high-cardinality column, a partial index eliminates the low-cardinality filter entirely:
CREATE INDEX idx_orders_active_created
ON orders (created_at)
WHERE status = 'active';
-- The WHERE clause is implicit; status never needs a runtime Filter
This is covered in detail under partial index implementation.
Common Pitfalls #
Function wrappers on indexed columns. Wrapping a column in LOWER(), DATE_TRUNC(), EXTRACT(), or any other function creates a non-sargable predicate and prevents index traversal. The Filter node above the scan is the diagnostic signal. Fix: rewrite the predicate to use raw column comparisons or create a function-based index on the exact expression.
Implicit type coercion from ORM parameters. ORMs bound to a generic string type often cause PostgreSQL to cast at runtime. The plan shows a Seq Scan where an index scan is expected, or a Filter node where Index Cond should appear. Fix: use explicit CAST() or typed literals in ORM query builders.
Composite index column order mismatch. If the predicate targets a trailing index column without the leading column, the optimizer cannot efficiently enter the index. EXPLAIN shows Rows Removed by Filter against a number much larger than the result set. Fix: add the leading column to the predicate, or reorder the index to put the highest-selectivity predicate column first.
OR conditions across multiple columns. WHERE a = 1 OR b = 2 cannot satisfy a standard B-tree index on either column alone. The planner may use a BitmapOr of two index scans, but only if separate indexes exist for both columns. Without them, the result is a sequential scan with a deferred filter. Fix: use UNION ALL to split the branches, or ensure individual indexes cover each OR branch so BitmapOr applies.
Ignoring Recheck Cond overhead in bitmap scans. Lossy bitmap scans report Recheck Cond and Heap Fetches. High Heap Fetches means many pages were fetched to evaluate rows that ultimately failed the recheck. Fix: review whether a direct Index Scan is cheaper, or increase work_mem to reduce bitmap lossiness. Full context is in Understanding Filter vs Recheck Conditions.
Disabling index scans globally. Using SET enable_seqscan = off to force index usage globally can cause index scans on tables where sequential scans are genuinely cheaper, and masks the root cause. Fix: diagnose the sargability or statistics issue directly; use SET LOCAL inside a transaction for isolated plan testing only.
Frequently Asked Questions #
Why does my index scan still show a separate Filter node? #
The optimizer pushed the most selective predicate to the Index Cond. Secondary predicates that cannot evaluate at the B-tree level — due to column order in the composite index, data type mismatch, or expression wrapping — remain as a Filter node applied after index entry retrieval. Reordering the composite index to put the highest-selectivity column first, or rewriting the predicate to remove the expression wrapper, usually eliminates the standalone Filter.
Can filter pushdown work with JOIN operations? #
Yes. The optimizer evaluates whether a predicate references columns from a single table and, when it does, relocates that predicate to that table’s scan node so it executes before the join operator. Reducing the row count that enters the join minimizes memory allocation for hash buffers or the sort input required for merge join operations.
How do I confirm a predicate has been pushed down? #
Run EXPLAIN (ANALYZE, BUFFERS) and look for Index Cond on the scan node rather than a standalone Filter node above it. Index Cond means the predicate was evaluated inside the index structure. A Filter line on the scan node means it was applied row-by-row during the scan. A separate Filter node above the scan means evaluation was deferred entirely. The Rows Removed by Filter counter quantifies how many rows were fetched and then discarded.
How do I force the optimizer to push a filter down? #
Ensure predicates are sargable — no function wrappers around indexed columns, matching data types, and operators the index type supports. Run ANALYZE to refresh statistics. Verify the index column order aligns with your highest-selectivity predicates. Planner configuration overrides such as SET enable_seqscan = off should be used only in diagnostic sessions, not in production.
Related #
- Understanding Filter vs Recheck Conditions — the exact boundary between storage-level filter and executor-level recheck in bitmap scans
- Identifying Plan Bottlenecks — locating the high-cost operator node before diagnosing its cause
- Sort and Hash Node Analysis — how deferred filters inflate sort and hash inputs downstream
- B-tree Index Optimization — composite index column ordering and operator class selection
- Partial Index Implementation — eliminating low-cardinality status filters via index-level WHERE clauses