
The PostgreSQL CTE
PostgreSQL has a lot of powerful features to "get the job done." In fact, I wonder if many of the features should be avoided. Regardless, the feature I want to talk about today is called Common Table Expressions (CTEs). These provide a powerful way to write (potentially) more readable and modular SQL queries. A CTE allows you to define a temporary named result set that can then be referenced within a SELECT or other statement.
And remember, the documentation is probably a better source.
Basic CTE Syntax
The basic syntax of a CTE starts with a 'WITH' keyword, and it goes like this:
WITH cte_name AS (
SELECT column_1, column_2
FROM table_name
)
SELECT * FROM cte_name;
This set of statements will create a CTE called "cte_name", which has a result set of column_1
and column_2
from the table called table_name
. Then, the expected SELECT statement is at the end, selecting everything from the cte_name
CTE.
Key Advantages of CTEs
So that's all well and fine, but what is the advantage to doing this?
Improved Readability
The first thing that sticks out to me is that you have a modular result set within the query. Modularity can help with readability. Theoretically, you could copy that CTE definition and use it in different places. Or, the CTE could create a sort of temporary VIEW, so you don't have to define that elsewhere.
A CTE can break down complex queries into more digestible parts, or logical sections. Just that alone could allow a large and complex query to become more manageable, because you can target smaller pieces of that query.
Recursive Queries
The interesting thing about CTEs, is that they can be recursive. Meaning, the CTE being defined can be iterated over again, potentially providing additional results.
Recursion can be particularly useful in the case of hierarchical data. Say we have this table:
CREATE TABLE team_members (id SERIAL PRIMARY KEY, name VARCHAR(20), manager_id INTEGER);
Then we can insert some datum:
-- Coaches
INSERT INTO team_members (name) VALUES ('Coach Bombay'), ('Jack Reilly');
-- Players
INSERT INTO team_members (name, manager_id) VALUES
('Charlie Conway', 1),
('Fulton Reed', 1),
('Dean Portman', 1),
('Goldberg', 1),
('Adam Banks', 2);
Now if we want to list out the players by coach then player, we can do something like this:
WITH RECURSIVE team_hierarchy AS (
SELECT id, name, manager_id, 1 AS level
FROM team_members
WHERE manager_id IS NULL
UNION ALL
SELECT t.id, t.name, t.manager_id, th.level + 1
FROM team_members t
JOIN team_hierarchy th ON t.manager_id = th.id
)
SELECT * FROM team_hierarchy;
Some Take-aways
- Use CTEs to improve query readability
- Leverage recursive CTEs for hierarchical data
- Consider performance implications for large datasets
- Use meaningful names for your CTEs
And something I didn't mention, you can define multiple CTEs, separating them by a comma.
Conclusion
Well, this is certainly not comparable to a cupholder on your favorite guitar. But, it's not nothing either. By breaking down complex logic into more manageable pieces, CTEs provide a powerful tool for database developers to write modular queries.
Keep your stick on the ice.