Wicked Smart Data
LearnArticlesAbout
Sign InSign Up
LearnArticlesAboutContact
Sign InSign Up
Wicked Smart Data

The go-to platform for professionals who want to master data, automation, and AI — from Excel fundamentals to cutting-edge machine learning.

Platform

  • Learning Paths
  • Articles
  • About
  • Contact

Connect

  • Contact Us
  • RSS Feed

© 2026 Wicked Smart Data. All rights reserved.

Privacy PolicyTerms of Service
All Articles

Advanced CTEs: Recursive Queries and Hierarchical Data

SQL🌱 Foundation14 min readMay 21, 2026Updated May 21, 2026
Table of Contents
  • Prerequisites
  • Understanding Recursive CTEs
  • Setting Up Our Example Data
  • Building Your First Recursive Query
  • Finding All Paths to the Top
  • Controlling Recursion Depth
  • Aggregating Across Hierarchy Levels
  • Working with Multiple Hierarchies
  • Hands-On Exercise
  • Common Mistakes & Troubleshooting
  • Summary & Next Steps

Advanced CTEs: Recursive Queries and Hierarchical Data

Imagine you're working with employee data at a growing tech company. You need to find everyone who reports to a specific manager, including people who report to those reports, and their reports too — essentially mapping out the entire organizational hierarchy beneath any given person. Or maybe you're analyzing a retail chain and need to trace how products flow through different distribution levels, from warehouses to regional centers to local stores.

These scenarios involve hierarchical data — information organized in parent-child relationships that can extend multiple levels deep. While regular SQL queries excel at working with flat data, they struggle with these tree-like structures where you don't know in advance how many levels you need to traverse.

That's where recursive Common Table Expressions (CTEs) come in. Recursive CTEs let you write queries that call themselves, building results incrementively by following relationships through as many levels as exist in your data. By the end of this lesson, you'll understand how to use this powerful technique to query hierarchical data structures with confidence.

What you'll learn:

  • How recursive CTEs work conceptually and syntactically
  • The anatomy of a recursive CTE: anchor and recursive members
  • Practical techniques for querying organizational hierarchies and tree structures
  • How to control recursion depth and avoid infinite loops
  • Common patterns for aggregating data across hierarchical levels
  • Troubleshooting recursive queries and avoiding performance pitfalls

Prerequisites

Before diving into recursive CTEs, you should be comfortable with:

  • Basic CTE syntax and usage
  • JOINs (particularly self-joins)
  • Aggregate functions and GROUP BY
  • Subqueries

If you need a refresher on regular CTEs, review that topic first, as recursive CTEs build directly on those foundations.

Understanding Recursive CTEs

A recursive CTE is a CTE that references itself within its own definition. Think of it like a mathematical function that calls itself with slightly different parameters each time, eventually building up a complete result.

The classic example is calculating factorials: 5! = 5 × 4!, where 4! = 4 × 3!, and so on. Each step uses the result from the previous step, continuing until reaching a base case (1! = 1).

Recursive CTEs work similarly. They start with a "base case" (called the anchor member), then repeatedly apply a rule (the recursive member) until no new rows can be generated.

Here's the basic syntax structure:

WITH RECURSIVE cte_name AS (
    -- Anchor member (base case)
    SELECT initial_data
    FROM some_table
    WHERE base_condition
    
    UNION [ALL]
    
    -- Recursive member (the recursive step)
    SELECT modified_data
    FROM some_table
    JOIN cte_name ON join_condition
    WHERE recursive_condition
)
SELECT * FROM cte_name;

Let's break this down:

  1. Anchor member: Defines the starting point — the initial rows that begin the recursion
  2. UNION [ALL]: Combines results from the anchor and recursive members
  3. Recursive member: References the CTE by name and typically joins it with another table to find the "next level"
  4. Termination: Recursion stops when the recursive member returns no new rows

Note: Some databases (like PostgreSQL) require the RECURSIVE keyword, while others (like SQL Server) don't use it but support the same functionality.

Setting Up Our Example Data

To make this concrete, let's work with an employee hierarchy table. This represents a common real-world scenario where each employee has a manager, creating a tree structure:

CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(100),
    manager_id INT,
    department VARCHAR(50),
    salary DECIMAL(10,2),
    hire_date DATE
);

INSERT INTO employees VALUES
(1, 'Sarah Chen', NULL, 'Executive', 150000, '2018-01-15'),
(2, 'Marcus Johnson', 1, 'Engineering', 120000, '2019-03-20'),
(3, 'Lisa Rodriguez', 1, 'Sales', 110000, '2019-06-10'),
(4, 'David Kim', 2, 'Engineering', 95000, '2020-02-12'),
(5, 'Amanda Foster', 2, 'Engineering', 98000, '2020-04-18'),
(6, 'Robert Taylor', 3, 'Sales', 85000, '2020-07-25'),
(7, 'Jennifer Wu', 3, 'Sales', 88000, '2020-09-30'),
(8, 'Michael Brown', 4, 'Engineering', 75000, '2021-01-08'),
(9, 'Emily Davis', 4, 'Engineering', 78000, '2021-03-15'),
(10, 'Chris Wilson', 6, 'Sales', 65000, '2021-05-20'),
(11, 'Jessica Moore', 6, 'Sales', 68000, '2021-08-12');

This creates a hierarchy where Sarah Chen (CEO) is at the top, Marcus and Lisa report to her, and so on. Let's visualize this structure:

Sarah Chen (CEO)
├── Marcus Johnson (Engineering Manager)
│   ├── David Kim (Senior Engineer)
│   │   ├── Michael Brown (Engineer)
│   │   └── Emily Davis (Engineer)
│   └── Amanda Foster (Senior Engineer)
└── Lisa Rodriguez (Sales Manager)
    ├── Robert Taylor (Senior Sales Rep)
    │   ├── Chris Wilson (Sales Rep)
    │   └── Jessica Moore (Sales Rep)
    └── Jennifer Wu (Senior Sales Rep)

Building Your First Recursive Query

Let's start with a fundamental question: "Who reports to Marcus Johnson, either directly or indirectly?"

Here's how we solve this with a recursive CTE:

WITH RECURSIVE reporting_chain AS (
    -- Anchor: Start with Marcus Johnson
    SELECT employee_id, name, manager_id, 1 as level
    FROM employees
    WHERE name = 'Marcus Johnson'
    
    UNION ALL
    
    -- Recursive: Find everyone who reports to someone in our current result set
    SELECT e.employee_id, e.name, e.manager_id, rc.level + 1
    FROM employees e
    JOIN reporting_chain rc ON e.manager_id = rc.employee_id
)
SELECT level, name, employee_id
FROM reporting_chain
ORDER BY level, name;

Let's trace through how this works:

Iteration 1 (Anchor):

  • Returns Marcus Johnson (level 1)

Iteration 2 (Recursive):

  • Finds employees where manager_id = Marcus's employee_id
  • Returns David Kim and Amanda Foster (level 2)

Iteration 3 (Recursive):

  • Finds employees where manager_id matches David's or Amanda's employee_id
  • Returns Michael Brown and Emily Davis (level 3)

Iteration 4 (Recursive):

  • Tries to find employees managed by Michael or Emily
  • Returns no rows, so recursion terminates

The result shows the complete hierarchy under Marcus:

level | name           | employee_id
------|----------------|------------
1     | Marcus Johnson | 2
2     | Amanda Foster  | 5
2     | David Kim      | 4
3     | Emily Davis    | 9
3     | Michael Brown  | 8

Key insight: The level column helps track how many steps away from the starting point each person is. This is incredibly useful for understanding the depth of your hierarchy.

Finding All Paths to the Top

Another common pattern is starting from any employee and finding their complete reporting chain up to the CEO. Let's find everyone above Emily Davis:

WITH RECURSIVE management_chain AS (
    -- Anchor: Start with Emily Davis
    SELECT employee_id, name, manager_id, 1 as level
    FROM employees
    WHERE name = 'Emily Davis'
    
    UNION ALL
    
    -- Recursive: Find each person's manager
    SELECT e.employee_id, e.name, e.manager_id, mc.level + 1
    FROM employees e
    JOIN management_chain mc ON e.employee_id = mc.manager_id
)
SELECT level, name, 
       CASE WHEN manager_id IS NULL THEN 'CEO' ELSE 'Reports to ID: ' || manager_id END as role
FROM management_chain
ORDER BY level;

This traverses upward through the hierarchy:

level | name           | role
------|----------------|------------------
1     | Emily Davis    | Reports to ID: 4
2     | David Kim      | Reports to ID: 2
3     | Marcus Johnson | Reports to ID: 1
4     | Sarah Chen     | CEO

Notice how we changed the JOIN condition: instead of finding subordinates (e.manager_id = rc.employee_id), we're finding managers (e.employee_id = mc.manager_id).

Controlling Recursion Depth

Sometimes you only want to traverse a limited number of levels. Most databases provide ways to limit recursion depth. Here's how to find only the direct and immediate subordinates (2 levels) under Sarah Chen:

WITH RECURSIVE reporting_chain AS (
    -- Anchor: Start with Sarah Chen  
    SELECT employee_id, name, manager_id, 1 as level
    FROM employees
    WHERE name = 'Sarah Chen'
    
    UNION ALL
    
    -- Recursive: Find subordinates, but stop at level 3
    SELECT e.employee_id, e.name, e.manager_id, rc.level + 1
    FROM employees e
    JOIN reporting_chain rc ON e.manager_id = rc.employee_id
    WHERE rc.level < 3  -- Stop recursion after level 3
)
SELECT level, name
FROM reporting_chain
ORDER BY level, name;

The WHERE rc.level < 3 condition prevents the recursive member from continuing beyond the desired depth.

You can also use database-specific options. In PostgreSQL, for example:

WITH RECURSIVE reporting_chain AS (
    -- ... same CTE definition ...
)
SELECT * FROM reporting_chain
OPTION (MAXRECURSION 3);

Aggregating Across Hierarchy Levels

Recursive CTEs become particularly powerful when you combine them with aggregation. Let's calculate the total salary cost for each manager, including all their subordinates at any level:

WITH RECURSIVE team_hierarchy AS (
    -- Anchor: Get all managers (people who have subordinates)
    SELECT DISTINCT m.employee_id, m.name as manager_name, 
           e.employee_id as subordinate_id, e.name as subordinate_name, 
           e.salary, 1 as level
    FROM employees m
    JOIN employees e ON m.employee_id = e.manager_id
    
    UNION ALL
    
    -- Recursive: Find subordinates of subordinates
    SELECT th.employee_id, th.manager_name,
           e.employee_id, e.name, e.salary, th.level + 1
    FROM team_hierarchy th
    JOIN employees e ON th.subordinate_id = e.manager_id
),
salary_totals AS (
    SELECT employee_id, manager_name, 
           COUNT(*) as total_subordinates,
           SUM(salary) as total_team_salary
    FROM team_hierarchy
    GROUP BY employee_id, manager_name
)
SELECT st.manager_name, 
       st.total_subordinates,
       st.total_team_salary,
       m.salary as manager_salary,
       st.total_team_salary + m.salary as total_cost
FROM salary_totals st
JOIN employees m ON st.employee_id = m.employee_id
ORDER BY total_cost DESC;

This query reveals the true cost of each part of the organization:

manager_name    | total_subordinates | total_team_salary | manager_salary | total_cost
----------------|-------------------|-------------------|----------------|------------
Marcus Johnson  | 4                 | 336000            | 120000         | 456000
Lisa Rodriguez  | 4                 | 306000            | 110000         | 416000
David Kim       | 2                 | 153000            | 95000          | 248000
Robert Taylor   | 2                 | 133000            | 85000          | 218000

Working with Multiple Hierarchies

Real-world data often contains multiple separate hierarchies. Let's add a second company division:

INSERT INTO employees VALUES
(12, 'Thomas Anderson', NULL, 'Technology', 160000, '2018-05-01'),
(13, 'Nina Patel', 12, 'IT', 105000, '2019-08-15'),
(14, 'Carlos Rivera', 12, 'Security', 108000, '2020-01-20'),
(15, 'Sophie Zhang', 13, 'IT', 82000, '2021-02-28'),
(16, 'Alex Thompson', 14, 'Security', 79000, '2021-06-10');

Now we have two separate hierarchies (one under Sarah Chen, another under Thomas Anderson). To process all hierarchies at once, we can modify our recursive CTE:

WITH RECURSIVE full_hierarchy AS (
    -- Anchor: All top-level managers (no manager_id)
    SELECT employee_id, name, manager_id, 
           employee_id as root_manager, name as root_name,
           1 as level
    FROM employees
    WHERE manager_id IS NULL
    
    UNION ALL
    
    -- Recursive: Build each hierarchy
    SELECT e.employee_id, e.name, e.manager_id,
           fh.root_manager, fh.root_name, fh.level + 1
    FROM employees e
    JOIN full_hierarchy fh ON e.manager_id = fh.employee_id
)
SELECT root_name as organization_head,
       level, name, 
       COUNT(*) OVER (PARTITION BY root_manager) as org_size
FROM full_hierarchy
ORDER BY root_manager, level, name;

This shows both hierarchies with their respective sizes:

organization_head  | level | name              | org_size
-------------------|-------|-------------------|----------
Sarah Chen         | 1     | Sarah Chen        | 11
Sarah Chen         | 2     | Lisa Rodriguez    | 11
Sarah Chen         | 2     | Marcus Johnson    | 11
...
Thomas Anderson    | 1     | Thomas Anderson   | 5
Thomas Anderson    | 2     | Carlos Rivera     | 5
Thomas Anderson    | 2     | Nina Patel        | 5
...

Hands-On Exercise

Now it's time to practice with a realistic scenario. You're analyzing a product category hierarchy for an e-commerce company. Create and populate this table:

CREATE TABLE product_categories (
    category_id INT PRIMARY KEY,
    category_name VARCHAR(100),
    parent_category_id INT,
    product_count INT
);

INSERT INTO product_categories VALUES
(1, 'Electronics', NULL, 0),
(2, 'Clothing', NULL, 0),
(3, 'Home & Garden', NULL, 0),
(4, 'Computers', 1, 0),
(5, 'Mobile Devices', 1, 0),
(6, 'Audio', 1, 0),
(7, 'Men''s Clothing', 2, 0),
(8, 'Women''s Clothing', 2, 0),
(9, 'Laptops', 4, 45),
(10, 'Desktop PCs', 4, 23),
(11, 'Smartphones', 5, 78),
(12, 'Tablets', 5, 34),
(13, 'Headphones', 6, 56),
(14, 'Speakers', 6, 29),
(15, 'Shirts', 7, 120),
(16, 'Pants', 7, 89),
(17, 'Dresses', 8, 156),
(18, 'Shoes', 8, 203);

Your task: Write a recursive CTE that calculates the total product count for each category, including all products in its subcategories. For example, "Electronics" should show the sum of all products in Computers, Mobile Devices, Audio, and their subcategories.

Try this yourself before looking at the solution below.

Solution:

WITH RECURSIVE category_totals AS (
    -- Anchor: Start with leaf categories (those with actual product counts)
    SELECT category_id, category_name, parent_category_id, 
           product_count, 1 as level
    FROM product_categories
    WHERE product_count > 0
    
    UNION ALL
    
    -- Recursive: Roll up totals to parent categories
    SELECT p.category_id, p.category_name, p.parent_category_id,
           ct.product_count, ct.level + 1
    FROM product_categories p
    JOIN category_totals ct ON p.category_id = ct.parent_category_id
),
aggregated_totals AS (
    SELECT category_id, category_name, parent_category_id,
           SUM(product_count) as total_products
    FROM category_totals
    GROUP BY category_id, category_name, parent_category_id
)
SELECT pc.category_name,
       COALESCE(at.total_products, 0) as total_products,
       CASE WHEN pc.parent_category_id IS NULL THEN 'Top Level'
            ELSE 'Subcategory' END as category_type
FROM product_categories pc
LEFT JOIN aggregated_totals at ON pc.category_id = at.category_id
ORDER BY pc.parent_category_id NULLS FIRST, pc.category_name;

This should show Electronics with 265 total products (45+23+78+34+56+29), Clothing with 568 products (120+89+156+203), and so on.

Common Mistakes & Troubleshooting

Infinite Recursion: The most dangerous mistake is creating a cycle in your data that causes infinite recursion. Always include proper termination conditions:

-- BAD: No protection against cycles
WITH RECURSIVE dangerous_query AS (
    SELECT employee_id, manager_id, 1 as level
    FROM employees
    WHERE employee_id = 1
    
    UNION ALL
    
    SELECT e.employee_id, e.manager_id, dq.level + 1
    FROM employees e
    JOIN dangerous_query dq ON e.manager_id = dq.employee_id
    -- No termination condition!
)

-- GOOD: Multiple safety measures
WITH RECURSIVE safe_query AS (
    SELECT employee_id, manager_id, 1 as level,
           CAST(employee_id as VARCHAR(1000)) as path
    FROM employees
    WHERE employee_id = 1
    
    UNION ALL
    
    SELECT e.employee_id, e.manager_id, sq.level + 1,
           sq.path || ',' || e.employee_id
    FROM employees e
    JOIN safe_query sq ON e.manager_id = sq.employee_id
    WHERE sq.level < 10  -- Depth limit
    AND sq.path NOT LIKE '%,' || e.employee_id || ',%'  -- Cycle detection
)

Performance Issues: Recursive CTEs can be expensive. Always:

  • Add appropriate indexes on the columns used in JOIN conditions
  • Use UNION ALL instead of UNION when you don't need to eliminate duplicates
  • Consider limiting recursion depth for large datasets
  • Test with realistic data volumes

Incorrect JOIN Logic: Double-check your JOIN conditions. The most common error is reversing the parent-child relationship:

-- Wrong: This finds managers of people in the CTE
JOIN employees e ON e.employee_id = cte.manager_id

-- Right: This finds subordinates of people in the CTE  
JOIN employees e ON e.manager_id = cte.employee_id

Missing Anchor Cases: Ensure your anchor member returns the correct starting points. If you want all hierarchies, start with rows where parent_id IS NULL. If you want a specific subtree, start with the specific root node.

Performance Tip: When working with large hierarchical datasets, consider materializing frequently-accessed hierarchy paths in a separate table to avoid repeated recursive computation.

Summary & Next Steps

Recursive CTEs are a powerful tool for working with hierarchical and tree-structured data. You now understand how to:

  • Write recursive CTEs with proper anchor and recursive members
  • Traverse hierarchies in both directions (up and down)
  • Control recursion depth and prevent infinite loops
  • Aggregate data across hierarchy levels
  • Handle multiple separate hierarchies in one query

The key to mastering recursive CTEs is practice with real-world hierarchical data. Look for opportunities in your own work — organizational charts, product categories, geographic regions, network structures, or any data with parent-child relationships.

Next steps to advance your skills:

  1. Graph Traversal: Learn about more complex recursive patterns for traversing networks and graphs
  2. Window Functions with Hierarchies: Combine recursive CTEs with advanced window functions for sophisticated analytical queries
  3. Performance Optimization: Study execution plans and indexing strategies for hierarchical queries
  4. Alternative Approaches: Research other methods for storing and querying hierarchical data, like nested sets or closure tables

Recursive CTEs unlock entirely new categories of analytical questions. Once you're comfortable with the basic patterns, you'll find yourself reaching for this technique whenever you encounter data with natural tree structures — and you'll be amazed at how many business problems actually involve hierarchical relationships.

Learning Path: Advanced SQL Queries

Previous

Common Table Expressions (CTEs) for Cleaner SQL

Related Articles

SQL🔥 Expert

SQL Indexes Explained: How They Work and When to Create Them

22 min
SQL⚡ Practitioner

NULL Handling in SQL: IS NULL, COALESCE, and NULLIF

19 min
SQL🌱 Foundation

String Functions and Date Functions in SQL: Complete Data Transformation Guide

15 min

On this page

  • Prerequisites
  • Understanding Recursive CTEs
  • Setting Up Our Example Data
  • Building Your First Recursive Query
  • Finding All Paths to the Top
  • Controlling Recursion Depth
  • Aggregating Across Hierarchy Levels
  • Working with Multiple Hierarchies
  • Hands-On Exercise
  • Common Mistakes & Troubleshooting
  • Summary & Next Steps