Inside PostgreSQL Recursive CTEs: Execution Model and Performance Optimization

In a previous discussion, we explored what CTEs and recursive CTEs are and how they help traverse hierarchical data such as:

Organization structures
Folder trees
Category hierarchies
Graph relationships

However, many engineers stop there.

To truly master recursive queries, you must understand how PostgreSQL executes them internally and how to optimize them for performance.

In this article we’ll explore:

PostgreSQL’s internal execution model
The working table and intermediate table
How recursion actually loops internally
Performance optimization strategies
Preventing infinite recursion
Indexing strategies for recursive queries

How PostgreSQL Internally Executes Recursive CTEs

Recursive CTEs are not executed as true recursion like function calls in programming languages.

Instead, PostgreSQL converts the recursive query into an iterative process that repeatedly processes result sets.

Internally, PostgreSQL uses two temporary structures:

Working Table
Intermediate Table

Understanding these two tables is the key to understanding recursive execution.

The Three Core Components of Recursive Execution

Every recursive CTE consists of:

1️Anchor Query (Base Query)
2️ Recursive Query
3️ Termination Condition

Example:

WITH RECURSIVE employee_hierarchy AS (
-- Anchor Query
    SELECT id, name, manager_id, 0 AS level
    FROM employees
    WHERE id = 2
 UNION ALL
    -- Recursive Query
    SELECT e.id, e.name, e.manager_id, h.level + 1
    FROM employee_hierarchy h
    JOIN employees e
    ON e.manager_id = h.id
)
SELECT * FROM employee_hierarchy;

Now let’s see how PostgreSQL executes it internally.

Internal Execution Flow

Internally PostgreSQL executes the recursive CTE in iterations.

The process looks like this:

Anchor Query
     ↓
Working Table
     ↓
Recursive Query
     ↓
Intermediate Table
     ↓
Append to Result
     ↓
Repeat until empty

Let’s break this down step-by-step.

Step 1 — Execute the Anchor Query

PostgreSQL starts by executing the anchor query.

SELECT id, name, manager_id
FROM employees
WHERE id = 2

Example result:

This result is stored in two places:

1️ Result Table
2️ Working Table

Result Table
-------------
2

Working Table
-------------
2

Step 2 — Run the Recursive Query Using the Working Table

The recursive query is executed using the working table as input.

SELECT e.id
FROM employees e
JOIN working_table w
ON e.manager_id = w.id

Suppose this produces:

These results go into the Intermediate Table.

Intermediate Table
------------------
4
5

Step 3 — Move Intermediate Table to Working Table

PostgreSQL then:

1️ Appends the intermediate rows to the Result Table
2️ Replaces the Working Table with the intermediate results

Result Table
-------------
2
4
5

Working Table
-------------
4
5

The intermediate table is then cleared.

Step 4 — Repeat the Recursive Query

The recursive query runs again using the new working table.

Working Table = 4,5

Query:

SELECT employees WHERE manager_id IN (4,5)

If new rows exist, they go to the intermediate table again.

Example:

Intermediate Table
------------------
7
8

Step 5 — Continue Until No Rows Are Returned

The loop continues until:

Intermediate Table = empty

Once no new rows are generated, recursion stops.

The final result is returned from the Result Table.

Visual Execution Model

Iteration 1
-----------
Anchor Query
Working Table = [2]

Iteration 2
-----------
Recursive Query
Intermediate Table = [4,5]
Working Table = [4,5]

Iteration 3
-----------
Recursive Query
Intermediate Table = [7,8]
Working Table = [7,8]

Iteration 4
-----------
Recursive Query
Intermediate Table = []
Stop recursion

Why PostgreSQL Uses This Approach

This iterative model has several advantages:

1️ Prevents Stack Overflow

True recursion could cause deep call stacks.

PostgreSQL instead uses set-based iteration.

2️ Works Efficiently With SQL Execution Engine

PostgreSQL is optimized for set processing, not procedural recursion.

3️ Allows Query Planner Optimization

The planner can optimize joins and filters at each iteration.

Performance Optimization for Recursive Queries

Recursive queries can become very expensive if the hierarchy is large.

Here are the most important optimization techniques.

1. Proper Indexing

The most critical index is on the parent relationship column.

Example table:

employees
---------
id
manager_id

Create this index:

CREATE INDEX idx_employees_manager
ON employees(manager_id);

Why?

Each recursive step performs:

JOIN employees e
ON e.manager_id = parent.id

Without this index, PostgreSQL will do full table scans at each recursion level.

2. Depth Limiting

Deep recursion can generate millions of rows.

Always apply a maximum depth if possible.

Example:

WITH RECURSIVE hierarchy AS (
    SELECT id, manager_id, 0 AS level
    FROM employees
    WHERE id = 1
    UNION ALL
    SELECT e.id, e.manager_id, h.level + 1
    FROM hierarchy h
    JOIN employees e
    ON e.manager_id = h.id
    WHERE h.level < 5
)
SELECT * FROM hierarchy;

This prevents unbounded recursion.

3. Avoid UNION When Possible

Always prefer:

UNION ALL

instead of:

UNION

Why?

UNION forces PostgreSQL to deduplicate rows, which requires sorting or hashing.

Recursive queries already run multiple iterations, so deduplication can significantly slow execution.

4. Filter Early

Push filters into the recursive query, not the final query.

Bad:

SELECT * FROM hierarchy
WHERE department = 'Engineering'

Better:

JOIN employees e
ON e.manager_id = h.id
AND e.department = 'Engineering'

This reduces rows during recursion.

5. Prevent Cycles (Critical)

In hierarchical data, cycles can occur.

Example:

A -> B -> C -> A

This would cause infinite recursion.

To prevent cycles, track visited nodes.

Example:

WITH RECURSIVE hierarchy AS (
    SELECT id, manager_id, ARRAY[id] AS path
    FROM employees
    WHERE id = 1
    UNION ALL
    SELECT e.id, e.manager_id, path || e.id
    FROM hierarchy h
    JOIN employees e
    ON e.manager_id = h.id
    WHERE NOT e.id = ANY(path)
)
SELECT * FROM hierarchy;

The path array stores visited nodes and prevents revisiting them.

6. Use EXPLAIN to Analyze Performance

Always analyze recursive queries using:

EXPLAIN ANALYZE

Example:

EXPLAIN ANALYZE
WITH RECURSIVE ...

This helps identify:

Sequential scans
Slow joins
Large result sets

Common Real-World Applications

Recursive CTEs power many production systems.

Organizational Hierarchies

CEO -> Managers -> Employees

File Systems

Root
 ├── Folder A
 │    └── File
 └── Folder B

Product Categories

Electronics
 ├── Phones
 └── Laptops

Graph Traversal

Find connected nodes

Key Takeaways

Recursive CTEs are one of the most powerful SQL features.

Internally PostgreSQL executes them using:

Working tables
Intermediate tables
Iterative processing

For good performance:

Create indexes on hierarchy columns
Limit recursion depth
Use UNION ALL
Filter early
Prevent cycles

Understanding these internals allows you to write efficient recursive queries even on very large datasets.

Inside PostgreSQL Recursive CTEs: Execution Model and Performance Optimization was originally published in Level Up Coding on Medium, where people are continuing the conversation by highlighting and responding to this story.

Inside PostgreSQL Recursive CTEs: Execution Model and Performance Optimization

How PostgreSQL Internally Executes Recursive CTEs

The Three Core Components of Recursive Execution

Internal Execution Flow

Step 1 — Execute the Anchor Query

Step 2 — Run the Recursive Query Using the Working Table

Step 3 — Move Intermediate Table to Working Table

Step 4 — Repeat the Recursive Query

Step 5 — Continue Until No Rows Are Returned

Visual Execution Model

Why PostgreSQL Uses This Approach

1️ Prevents Stack Overflow

2️ Works Efficiently With SQL Execution Engine

3️ Allows Query Planner Optimization

Performance Optimization for Recursive Queries

1. Proper Indexing

2. Depth Limiting

3. Avoid UNION When Possible

4. Filter Early

5. Prevent Cycles (Critical)

6. Use EXPLAIN to Analyze Performance

Common Real-World Applications

Organizational Hierarchies

File Systems

Product Categories

Graph Traversal

Key Takeaways

NexaPay — Accept Card Payments, Receive Crypto

Related Articles

We’ve all seen the cycle.

Iran internet blackout hits 65 days amid US and Israeli military tensions

Xero vs QuickBooks Desktop: What Actually Happens to Your Data When You Switch?

How to Restore a Xero Organisation After Accidental Deletion

Is Cryptocurrency Haram in Islam?

Can QuickBooks Desktop Files with 10+ Years of History Actually Be Migrated to Xero?