20th Apr 2023 10 minutes read

CTE vs. Subquery in SQL: What’s the Difference?

Table of Contents

What Is a Subquery?
And What Are CTEs?
What Are the Differences Between CTEs and Subqueries?
CTE vs. Subquery

What are Common Table Expressions (CTEs)? Are they the same as subqueries? When would you need to use CTEs? This article looks at the similarities and differences between CTE vs subquery.

When I introduce a student to Common Table Expressions, their first reaction is “That’s just a subquery! Why do I need to learn that?”. Let’s answer this question by looking at what you can do with an SQL subquery and what extra advantages there are in using a CTE.

If you decide after reading the article that you would like to learn how to use SQL CTEs, LearnSQL.com offers a comprehensive Common Table Expressions course that gives in-depth coverage of the topic. Beginning with simple examples, you'll work through to difficult concepts like processing trees and graphs. Each topic has step-by-step explanations and examples. With 114 interactive exercises, you'll get plenty of practice. You'll probably need about 18 hours to complete the course.

What Is a Subquery?

A subquery is a query nested inside the main query; this is best explained by looking at an example. All examples in this article use SQL Server; the syntax may be slightly different in other SQL dialects.

Suppose management wants to offer a discount to all customers whose annual purchases were higher than the average for all customers. The query to extract a list of these customers could look like this:

SELECT 
  account_no, 
  name 
FROM customers 
WHERE annual_purchases >
  (SELECT AVG(annual_purchases) FROM customers);

The main query lists the customers, and the subquery calculates the average annual purchases for all customers. The subquery doesn't need a name (except when you're working in PostgreSQL).

If you're not familiar with subqueries, you may like to have a look at LearnSQL.com's SQL Basics course, which has a section devoted to subqueries. You can read more about subqueries in our Beginner’s Guide to the SQL Subquery. How to Practice SQL Subqueries gives you some ideas on how to put these concepts into practice.

And What Are CTEs?

Common Table Expressions are named result sets that are defined at the front of a query and can be accessed by the query just as if they were tables. Suppose we wanted to compare employees' salaries to the average salary for their role. The query may look like this:

WITH avg_salary AS (
  SELECT 
    role, 
    avg(salary) AS average 
  FROM employee 
  GROUP BY role
)
SELECT 
  employee.role, 
  name, 
  salary, 
  avg_salary
FROM employee 
JOIN avg_salary ON avg_salary.role = employee.role
ORDER BY role, name

CTEs are introduced by the keyword WITH, and the results are stored in a named temporary table. In this example, the results of the CTE are stored in the table avg_salary, and this is used by the main query to pick up the average salary for each role.

In fact, simple CTEs like this one could equally well be written as subqueries. The article "What Is a CTE?" explains CTEs in more detail.

What Are the Differences Between CTEs and Subqueries?

I'll start by briefly listing the major differences, then I'll look at some of them in more detail.

CTEs are defined at the front of the query, whereas subqueries are defined inline.
CTEs must always be named. Only PostgreSQL insists that subqueries must have a name.
CTEs can be used recursively. I'll explain this later in the article.
CTEs are much more readable than subqueries when you're writing a complex report.
A CTE can be used many times within a query, whereas a subquery can only be used once. This can make the query definition much shorter, but it won't necessarily result in improved performance.
Subqueries can be used in a WHERE clause in conjunction with the keywords IN or EXISTS, but you can't do this with CTEs.
Subqueries can be used to pick up a single piece of data from one table in order to update a value in another table.

Some functionality is limited to subqueries only. I’ll give two examples where subqueries cannot be replaced with CTEs. In the first example, the subquery forms part of the WHERE clause. In the second, it’s used to pick up a single value from one table in order to update another. As I mentioned above, SQL doesn’t allow you to use CTEs for these tasks.

Subqueries Only: Using Subqueries in WHERE

In the first example, a bank holds details of all the transactions for the day in a table called daily_trans. The data in this table includes an account number, a transaction code, and an amount.

The database also has a table called transaction_types, and its columns include the transaction code and an indicator called debit_credit, which is set to 1 for transaction types that credit the customer's account, and 2 for those that debit the account.

If the bank wants a list of all credit transactions for the day, the query could look like this:

SELECT 
  account_no, 
  tran_code, 
  amount
FROM daily_trans
WHERE tran_code IN
  (SELECT tran_code 
   FROM transaction_types
   WHERE debit_credit = 1);

This type of query can only be written using a subquery. You couldn’t replace the subquery with a CTE here.

Subqueries Only: Using Subqueries in UPDATE

In the second example, the same bank has a table called customer, whose columns include an account number, a customer name, and the employee number of their designated customer support person.

The bank has done some reshuffling of responsibilities and wants to re-assign the support person for some customers. To do this, they've created a table called reassignments that holds the old support employee's ID number and the employee number of the support person who will take over their responsibilities.

To carry out the reassignments, we could write a query like this:

UPDATE customer 
SET support_person =
  (SELECT new_employee 
   FROM reassignments
   WHERE old_employee = customer.support_person);

CTEs Make a Complex Query More Readable

To illustrate how CTEs can make things easier to understand, let's take a query that uses multiple subqueries and recode it using CTEs.

Suppose we have a store that sells three types of product: books, music, and videos. The manager wants to know how much each customer has purchased in each category.

The report may look something like this:

Customer	Total	Books	Music	Videos
M. Mouse	150	60	40	50
F. Flintstone	90	10	20	60

Here’s a query that uses subqueries to produce the report:

SELECT 
	customer, 
	sum(purchases) AS Total, 
	total_books AS Books, 
	total_music AS Music, 
	total_videos AS Videos
FROM sales
JOIN 
	(SELECT account_no, sum(purchases) AS total_books FROM sales 
	WHERE product_type = 'Books'
	GROUP BY account_no) books
ON books.account_no = sales.account_no
JOIN 
	(SELECT account_no, sum(purchases) AS total_music FROM sales 
	WHERE product_type = 'Music'
	GROUP BY account_no) music
ON music.account_no = sales.account_no
JOIN 
	(SELECT account_no, sum(purchases) AS total_videos FROM sales 
	WHERE product_type = 'Videos'
	GROUP BY account_no) videos
ON videos.account_no = sales.account_no
GROUP BY customer
ORDER BY customer

As you can see, it's quite complicated. And it’s hard to follow if someone else needs to make changes to it later.

Now let's see what this query might look like if we rewrite it using Common Table Expressions:

WITH books AS (
  SELECT 
    customer, 
    sum(purchases) AS total_books 
  FROM sales 
  WHERE product_type = 'Books'
  GROUP BY customer
),

music AS (
  SELECT 
    customer, 
    sum(purchases) AS total_music 
  FROM sales 
  WHERE product_type = 'Music'
  GROUP BY customer
),

videos as (
  SELECT 
    customer, 
    sum(purchases) AS total_videos 
  FROM sales 
  WHERE product_type = 'Videos'
  GROUP BY customer
)

SELECT 
  customer, 
  sum(purchases) AS Total, 
  total_books AS Books, 
  total_music AS Music, 
  total_videos AS Videos
FROM sales
JOIN books ON books.customer = sales.customer
JOIN music ON music.customer = sales.customer
JOIN videos ON videos.customer = sales.customer
GROUP BY customer
ORDER BY customer

Most people will find this second version much easier to understand, although both queries produce exactly the same results.

What Are Recursive CTEs?

I said earlier that only CTEs can be recursive. So what's a recursive query? Recursive queries allow you to navigate hierarchical data, and produce reports suitable for tree- and graph-type data. Examples of hierarchical data include:

In an organization, an employee may report to a sub-manager; the sub-manager reports to a manager, and the manager reports to the b
In manufacturing, a product may be made of several components. Each component could also be made of many sub-components, and the sub-components could be made from various raw materials.

Let's look at an example. An ice cream parlour has several items on the menu. Each menu item may be made from several ingredients: a banana split is made up of bananas, chocolate sauce, and ice cream. But the chocolate sauce also has several ingredients. These may include cocoa powder, sugar and some other things.

The owner wants a complete list of each menu item followed by all its ingredients. Part of the list may look like this:

Item	Ingredient
Menu	150
Menu > Banana Split	Banana Split
Menu > Banana Split > Banana	Banana
Menu > Banana Split > Chocolate Sauce	Chocolate Sauce
Menu > Banana Split > Chocolate Sauce > Cocoa	Cocoa
Menu > Banana Split > Chocolate Sauce > Margarine	Margarine
Menu > Banana Split > Chocolate Sauce > Sugar	Sugar
Menu > Banana Split > Ice cream - American	Ice cream - American
Menu > Banana Split > Ice cream - American > Cream	Cream
Menu > Banana Split > Ice cream - American > Milk	Milk
Menu > Banana Split > Ice cream - American > Sugar	Sugar
Menu > Banana Split > Ice cream - American > Vanilla Extract	Vanilla Extract
Menu > Choc Nut Sundae	Choc Nut Sundae
Menu > Choc Nut Sundae > Chocolate Sauce	Chocolate Sauce
Menu > Choc Nut Sundae > Chocolate Sauce > Cocoa	Cocoa
Menu > Choc Nut Sundae > Chocolate Sauce > Margarine	Margarine
Menu > Choc Nut Sundae > Chocolate Sauce > Sugar	Sugar
Menu > Choc Nut Sundae > Ice cream - Rich	Ice cream - Rich
Menu > Choc Nut Sundae > Ice cream - Rich > Cream	Cream
Menu > Choc Nut Sundae > Ice cream - Rich > Egg	Egg
Menu > Choc Nut Sundae > Ice cream - Rich > Sugar	Sugar
Menu > Choc Nut Sundae > Ice cream - Rich > Vanilla Extract	Vanilla Extract
Menu > Choc Nut Sundae > Mixed Nuts	Mixed Nuts

In the database, we have two tables:

The Item table holds a list of each item in the menu and each ingredient.
The Bill_of_materials table holds links between each item and its ingredients.

The Items table holds this information:

id	Description	unit	price
15	Menu	NULL	NULL
14	Egg	Each	0.1
13	Banana	Each	0.2
12	Banana Split	Each	2
11	Margarine	Kg	4
10	Cocoa	Kg	10
9	Chocolate Sauce	Litre	8
8	Mixed Nuts	Kg	2
7	Choc Nut Sundae	Each	1.5
6	Ice Cream - Rich	Litre	6
5	Sugar	Kg	2
4	Vanilla Extract	Bottle	1
3	Milk	Litre	1.5
2	Cream	Litre	4
1	Ice Cream - American	Litre	5

Here are some examples of entries in the Bill_of_materials table. The column item_id holds a link to the parent item in the Items table, whereas component_id holds a link to one of its ingredients. The first entry, therefore, shows that Item 10: Cocoa is an ingredient of Item 9: Chocolate Sauce.

id	item_id	component_id	quantity
1	9	10	0.25
2	9	11	0.25
3	9	5	0.25
4	1	2	0.5

Here’s the recursive query used to traverse this information. The query was written in SQL Server; other dialects would be slightly different.

WITH menu_ingredients (id, path, description, item_id)
AS (
  SELECT 
CAST (id AS bigint), 
CAST (description as varchar (255)), 
CAST ('' AS varchar(40)),
CAST (id AS bigint)
   FROM items 
   WHERE description = 'Menu'
   
   UNION ALL
   
   SELECT
CAST (bom.component_id AS bigint), 
CAST (m.path + ' > ' + i.description AS varchar(255)),
i.description,
CAST (bom.item_id AS bigint)
   FROM menu_ingredients m, bill_of_materials bom
   JOIN items i
   ON i.id = bom.component_id
   WHERE bom.item_id = m.id
)

SELECT 
  path, 
  description 
FROM menu_ingredients
ORDER BY path

A full explanation of what’s going on here is outside the scope of this article, but I’ll quickly explain the basics. Briefly, when a row is added to the result set in a recursive query, it can 'see' the previous row and use it to pick up a piece of information that can be used to find the next row. This query begins by picking up the top level entry in the bill of materials: the menu itself. From there, it can cycle through all its 'child' rows – the ingredients it's made from. And each ingredient can pick up its own child rows, if it has any. For a more detailed explanation of recursion, check out What Is a Recursive CTE in SQL?. And as I mentioned earlier, recursive queries are fully covered in LearnSQL.com's Common Table Expressions course.

CTE vs. Subquery

In summary, choose a CTE when:

You want to make a complex query more readable.
You need to use a recursive query.

Choose a subquery when:

You're using the WHERE clause keywords IN or EXISTS to pick up the selection criteria from another table.
You want to select a single piece of data from another table as the new value for a field in an UPDATE statement.

In all other circumstances, it's your choice: you can use whichever you feel most comfortable with. If you want more examples of where to use Subquery and where to use CTE - check out this tutorial.

And if you need some practice with either CTEs or subqueries, you might like to try the SQL Practice Set, which has hundreds of interactive exercises to help you consolidate your skills.

Happy learning!

Tags: