Start now →

Inside PostgreSQL Recursive CTEs: Execution Model and Performance Optimization

By Usama Sarwar · Published March 11, 2026 · 5 min read · Source: Level Up Coding
Blockchain
Inside PostgreSQL Recursive CTEs: Execution Model and Performance Optimization

In a previous discussion, we explored what CTEs and recursive CTEs are and how they help traverse hierarchical data such as:

However, many engineers stop there.

To truly master recursive queries, you must understand how PostgreSQL executes them internally and how to optimize them for performance.

In this article we’ll explore:

  1. PostgreSQL’s internal execution model
  2. The working table and intermediate table
  3. How recursion actually loops internally
  4. Performance optimization strategies
  5. Preventing infinite recursion
  6. Indexing strategies for recursive queries

How PostgreSQL Internally Executes Recursive CTEs

Recursive CTEs are not executed as true recursion like function calls in programming languages.

Instead, PostgreSQL converts the recursive query into an iterative process that repeatedly processes result sets.

Internally, PostgreSQL uses two temporary structures:

  1. Working Table
  2. Intermediate Table

Understanding these two tables is the key to understanding recursive execution.

The Three Core Components of Recursive Execution

Every recursive CTE consists of:

1️Anchor Query (Base Query)
2️ Recursive Query
3️ Termination Condition

Example:

WITH RECURSIVE employee_hierarchy AS (
-- Anchor Query
SELECT id, name, manager_id, 0 AS level
FROM employees
WHERE id = 2
UNION ALL
-- Recursive Query
SELECT e.id, e.name, e.manager_id, h.level + 1
FROM employee_hierarchy h
JOIN employees e
ON e.manager_id = h.id
)
SELECT * FROM employee_hierarchy;

Now let’s see how PostgreSQL executes it internally.

Internal Execution Flow

Internally PostgreSQL executes the recursive CTE in iterations.

The process looks like this:

Anchor Query

Working Table

Recursive Query

Intermediate Table

Append to Result

Repeat until empty

Let’s break this down step-by-step.

Step 1 — Execute the Anchor Query

PostgreSQL starts by executing the anchor query.

SELECT id, name, manager_id
FROM employees
WHERE id = 2

Example result:

This result is stored in two places:

1️ Result Table
2️ Working Table

Result Table
-------------
2
Working Table
-------------
2

Step 2 — Run the Recursive Query Using the Working Table

The recursive query is executed using the working table as input.

SELECT e.id
FROM employees e
JOIN working_table w
ON e.manager_id = w.id

Suppose this produces:

These results go into the Intermediate Table.

Intermediate Table
------------------
4
5

Step 3 — Move Intermediate Table to Working Table

PostgreSQL then:

1️ Appends the intermediate rows to the Result Table
2️ Replaces the Working Table with the intermediate results

Result Table
-------------
2
4
5
Working Table
-------------
4
5

The intermediate table is then cleared.

Step 4 — Repeat the Recursive Query

The recursive query runs again using the new working table.

Working Table = 4,5

Query:

SELECT employees WHERE manager_id IN (4,5)

If new rows exist, they go to the intermediate table again.

Example:

Intermediate Table
------------------
7
8

Step 5 — Continue Until No Rows Are Returned

The loop continues until:

Intermediate Table = empty

Once no new rows are generated, recursion stops.

The final result is returned from the Result Table.

Visual Execution Model

Iteration 1
-----------
Anchor Query
Working Table = [2]
Iteration 2
-----------
Recursive Query
Intermediate Table = [4,5]
Working Table = [4,5]
Iteration 3
-----------
Recursive Query
Intermediate Table = [7,8]
Working Table = [7,8]
Iteration 4
-----------
Recursive Query
Intermediate Table = []
Stop recursion

Why PostgreSQL Uses This Approach

This iterative model has several advantages:

1️ Prevents Stack Overflow

True recursion could cause deep call stacks.

PostgreSQL instead uses set-based iteration.

2️ Works Efficiently With SQL Execution Engine

PostgreSQL is optimized for set processing, not procedural recursion.

3️ Allows Query Planner Optimization

The planner can optimize joins and filters at each iteration.

Performance Optimization for Recursive Queries

Recursive queries can become very expensive if the hierarchy is large.

Here are the most important optimization techniques.

1. Proper Indexing

The most critical index is on the parent relationship column.

Example table:

employees
---------
id
manager_id

Create this index:

CREATE INDEX idx_employees_manager
ON employees(manager_id);

Why?

Each recursive step performs:

JOIN employees e
ON e.manager_id = parent.id

Without this index, PostgreSQL will do full table scans at each recursion level.

2. Depth Limiting

Deep recursion can generate millions of rows.

Always apply a maximum depth if possible.

Example:

WITH RECURSIVE hierarchy AS (
SELECT id, manager_id, 0 AS level
FROM employees
WHERE id = 1
UNION ALL
SELECT e.id, e.manager_id, h.level + 1
FROM hierarchy h
JOIN employees e
ON e.manager_id = h.id
WHERE h.level < 5
)
SELECT * FROM hierarchy;

This prevents unbounded recursion.

3. Avoid UNION When Possible

Always prefer:

UNION ALL

instead of:

UNION

Why?

UNION forces PostgreSQL to deduplicate rows, which requires sorting or hashing.

Recursive queries already run multiple iterations, so deduplication can significantly slow execution.

4. Filter Early

Push filters into the recursive query, not the final query.

Bad:

SELECT * FROM hierarchy
WHERE department = 'Engineering'

Better:

JOIN employees e
ON e.manager_id = h.id
AND e.department = 'Engineering'

This reduces rows during recursion.

5. Prevent Cycles (Critical)

In hierarchical data, cycles can occur.

Example:

A -> B -> C -> A

This would cause infinite recursion.

To prevent cycles, track visited nodes.

Example:

WITH RECURSIVE hierarchy AS (
SELECT id, manager_id, ARRAY[id] AS path
FROM employees
WHERE id = 1
UNION ALL
SELECT e.id, e.manager_id, path || e.id
FROM hierarchy h
JOIN employees e
ON e.manager_id = h.id
WHERE NOT e.id = ANY(path)
)
SELECT * FROM hierarchy;

The path array stores visited nodes and prevents revisiting them.

6. Use EXPLAIN to Analyze Performance

Always analyze recursive queries using:

EXPLAIN ANALYZE

Example:

EXPLAIN ANALYZE
WITH RECURSIVE ...

This helps identify:

Common Real-World Applications

Recursive CTEs power many production systems.

Organizational Hierarchies

CEO -> Managers -> Employees

File Systems

Root
├── Folder A
│ └── File
└── Folder B

Product Categories

Electronics
├── Phones
└── Laptops

Graph Traversal

Find connected nodes

Key Takeaways

Recursive CTEs are one of the most powerful SQL features.

Internally PostgreSQL executes them using:

For good performance:

Understanding these internals allows you to write efficient recursive queries even on very large datasets.


Inside PostgreSQL Recursive CTEs: Execution Model and Performance Optimization was originally published in Level Up Coding on Medium, where people are continuing the conversation by highlighting and responding to this story.

This article was originally published on Level Up Coding and is republished here under RSS syndication for informational purposes. All rights and intellectual property remain with the original author. If you are the author and wish to have this article removed, please contact us at [email protected].

NexaPay — Accept Card Payments, Receive Crypto

No KYC · Instant Settlement · Visa, Mastercard, Apple Pay, Google Pay

Get Started →