
In a previous discussion, we explored what CTEs and recursive CTEs are and how they help traverse hierarchical data such as:
- Organization structures
- Folder trees
- Category hierarchies
- Graph relationships
However, many engineers stop there.
To truly master recursive queries, you must understand how PostgreSQL executes them internally and how to optimize them for performance.
In this article we’ll explore:
- PostgreSQL’s internal execution model
- The working table and intermediate table
- How recursion actually loops internally
- Performance optimization strategies
- Preventing infinite recursion
- Indexing strategies for recursive queries
How PostgreSQL Internally Executes Recursive CTEs
Recursive CTEs are not executed as true recursion like function calls in programming languages.
Instead, PostgreSQL converts the recursive query into an iterative process that repeatedly processes result sets.
Internally, PostgreSQL uses two temporary structures:
- Working Table
- Intermediate Table
Understanding these two tables is the key to understanding recursive execution.
The Three Core Components of Recursive Execution
Every recursive CTE consists of:
1️Anchor Query (Base Query)
2️ Recursive Query
3️ Termination Condition
Example:
WITH RECURSIVE employee_hierarchy AS (
-- Anchor Query
SELECT id, name, manager_id, 0 AS level
FROM employees
WHERE id = 2
UNION ALL
-- Recursive Query
SELECT e.id, e.name, e.manager_id, h.level + 1
FROM employee_hierarchy h
JOIN employees e
ON e.manager_id = h.id
)
SELECT * FROM employee_hierarchy;
Now let’s see how PostgreSQL executes it internally.
Internal Execution Flow
Internally PostgreSQL executes the recursive CTE in iterations.
The process looks like this:
Anchor Query
↓
Working Table
↓
Recursive Query
↓
Intermediate Table
↓
Append to Result
↓
Repeat until empty
Let’s break this down step-by-step.
Step 1 — Execute the Anchor Query
PostgreSQL starts by executing the anchor query.
SELECT id, name, manager_id
FROM employees
WHERE id = 2
Example result:

This result is stored in two places:
1️ Result Table
2️ Working Table
Result Table
-------------
2
Working Table
-------------
2
Step 2 — Run the Recursive Query Using the Working Table
The recursive query is executed using the working table as input.
SELECT e.id
FROM employees e
JOIN working_table w
ON e.manager_id = w.id
Suppose this produces:

These results go into the Intermediate Table.
Intermediate Table
------------------
4
5
Step 3 — Move Intermediate Table to Working Table
PostgreSQL then:
1️ Appends the intermediate rows to the Result Table
2️ Replaces the Working Table with the intermediate results
Result Table
-------------
2
4
5
Working Table
-------------
4
5
The intermediate table is then cleared.
Step 4 — Repeat the Recursive Query
The recursive query runs again using the new working table.
Working Table = 4,5
Query:
SELECT employees WHERE manager_id IN (4,5)
If new rows exist, they go to the intermediate table again.
Example:
Intermediate Table
------------------
7
8
Step 5 — Continue Until No Rows Are Returned
The loop continues until:
Intermediate Table = empty
Once no new rows are generated, recursion stops.
The final result is returned from the Result Table.
Visual Execution Model
Iteration 1
-----------
Anchor Query
Working Table = [2]
Iteration 2
-----------
Recursive Query
Intermediate Table = [4,5]
Working Table = [4,5]
Iteration 3
-----------
Recursive Query
Intermediate Table = [7,8]
Working Table = [7,8]
Iteration 4
-----------
Recursive Query
Intermediate Table = []
Stop recursion
Why PostgreSQL Uses This Approach
This iterative model has several advantages:
1️ Prevents Stack Overflow
True recursion could cause deep call stacks.
PostgreSQL instead uses set-based iteration.
2️ Works Efficiently With SQL Execution Engine
PostgreSQL is optimized for set processing, not procedural recursion.
3️ Allows Query Planner Optimization
The planner can optimize joins and filters at each iteration.
Performance Optimization for Recursive Queries
Recursive queries can become very expensive if the hierarchy is large.
Here are the most important optimization techniques.
1. Proper Indexing
The most critical index is on the parent relationship column.
Example table:
employees
---------
id
manager_id
Create this index:
CREATE INDEX idx_employees_manager
ON employees(manager_id);
Why?
Each recursive step performs:
JOIN employees e
ON e.manager_id = parent.id
Without this index, PostgreSQL will do full table scans at each recursion level.
2. Depth Limiting
Deep recursion can generate millions of rows.
Always apply a maximum depth if possible.
Example:
WITH RECURSIVE hierarchy AS (
SELECT id, manager_id, 0 AS level
FROM employees
WHERE id = 1
UNION ALL
SELECT e.id, e.manager_id, h.level + 1
FROM hierarchy h
JOIN employees e
ON e.manager_id = h.id
WHERE h.level < 5
)
SELECT * FROM hierarchy;
This prevents unbounded recursion.
3. Avoid UNION When Possible
Always prefer:
UNION ALL
instead of:
UNION
Why?
UNION forces PostgreSQL to deduplicate rows, which requires sorting or hashing.
Recursive queries already run multiple iterations, so deduplication can significantly slow execution.
4. Filter Early
Push filters into the recursive query, not the final query.
Bad:
SELECT * FROM hierarchy
WHERE department = 'Engineering'
Better:
JOIN employees e
ON e.manager_id = h.id
AND e.department = 'Engineering'
This reduces rows during recursion.
5. Prevent Cycles (Critical)
In hierarchical data, cycles can occur.
Example:
A -> B -> C -> A
This would cause infinite recursion.
To prevent cycles, track visited nodes.
Example:
WITH RECURSIVE hierarchy AS (
SELECT id, manager_id, ARRAY[id] AS path
FROM employees
WHERE id = 1
UNION ALL
SELECT e.id, e.manager_id, path || e.id
FROM hierarchy h
JOIN employees e
ON e.manager_id = h.id
WHERE NOT e.id = ANY(path)
)
SELECT * FROM hierarchy;
The path array stores visited nodes and prevents revisiting them.
6. Use EXPLAIN to Analyze Performance
Always analyze recursive queries using:
EXPLAIN ANALYZE
Example:
EXPLAIN ANALYZE
WITH RECURSIVE ...
This helps identify:
- Sequential scans
- Slow joins
- Large result sets
Common Real-World Applications
Recursive CTEs power many production systems.
Organizational Hierarchies
CEO -> Managers -> Employees
File Systems
Root
├── Folder A
│ └── File
└── Folder B
Product Categories
Electronics
├── Phones
└── Laptops
Graph Traversal
Find connected nodes
Key Takeaways
Recursive CTEs are one of the most powerful SQL features.
Internally PostgreSQL executes them using:
- Working tables
- Intermediate tables
- Iterative processing
For good performance:
- Create indexes on hierarchy columns
- Limit recursion depth
- Use UNION ALL
- Filter early
- Prevent cycles
Understanding these internals allows you to write efficient recursive queries even on very large datasets.
Inside PostgreSQL Recursive CTEs: Execution Model and Performance Optimization was originally published in Level Up Coding on Medium, where people are continuing the conversation by highlighting and responding to this story.