10th Sep 2024 14 minutes read The MySQL CTE and How to Use It Gustavo du Mortier MySQL Common Table Expressions Table of Contents Basic Syntax for a MySQL CTE A CTE in MySQL That Returns a Single Value CTE with Grouping and Aggregate Functions Multiple and Nested CTEs Using Multiple CTEs in a Single Query Nesting CTEs Using MySQL CTEs With Other DML Commands Using a CTE with INSERT Using a CTE with UPDATE Recursive CTEs in MySQL MySQL CTEs In a Nutshell Common table expressions (CTEs) in MySQL give even more versatility to this popular database management system. Find out how to use CTEs to simplify complex queries and implement recursion. All programming languages have ways of simplifying problems by breaking them down into parts that can be solved individually. Users can unify the partial results to obtain a single final result. Thanks to something called common table expressions (CTEs), MySQL is now exception. A MySQL CTE allows you to assign a name to a temporary result set and then refer to that result set by name (as if it were a table or a view) in SELECT, INSERT, UPDATE, or DELETE statements. CTEs have been part of the SQL standard since 1999. However, their effective implementation in the dialects of each relational database management system (RDBMS) occurred gradually, as users began to find this new feature useful. MySQL was one of the last of the popular RDBMSs to incorporate support for CTEs. It was only in version 8, released in April 2018, that support for CTEs appeared in MySQL. As we will see below, one of the main uses of MySQL CTE is to implement recursion. You can take our interactive course on recursive queries in MySQL if you want to jump right into this subject. In this course, you can get all the necessary knowledge to master recursive queries in MySQL 8. All you need is a web browser, an Internet connection, and knowledge of basic SQL (including the JOIN and GROUP BY clauses). There are 114 interactive exercises with hints and online help so you can really sharpen your skills. For now, though, let’s focus on the basics of MySQL CTEs, starting with the simplest syntax. Basic Syntax for a MySQL CTE The syntax for writing a MySQL CTE begins with the word WITH, followed by the name of the CTE and a list of the columns that the CTE will return. This list is not mandatory, since it can be left to be defined by the result of the query that defines the CTE. This query must be written in parentheses after the word AS. The basic syntax of common table expressions in MySQL can be summarized as follows: WITH cte_name (column_list) AS ( query ) SELECT * FROM cte_name; After the parenthesis enclosing the CTE query, we have a conventional SQL statement that refers to the CTE by its name, just as if it were a view. The statement can be a DML (Data Manipulation Language) command such as SELECT, INSERT, DELETE, or UPDATE. If you want a quick reference guide to MySQL syntax or commands, bookmark our free MySQL Cheat Sheet. It will help you refresh your memory on common (and not so common) MySQL phrases. A CTE in MySQL That Returns a Single Value Let’s see an example of a CTE created in a MySQL database. We’ll use sales information and show the difference between the amount of each sale and the average sale amount. We have a sales table with detailed sales information. For this example, we are interested in three columns: customer_id, sale_id, and amount. Here’s some sample data: customer_idsale_idamount 800340121507995294.45 7900552315079981045.12 48102066150800112.57 492300211508022499.14 We’ll create a CTE called sales_average that will return a single value: the average of all the values in the amount column of the sales table (i.e. the overall average of all sale amounts). In the external SELECT – the one that invokes the CTE – we will use the value returned by the CTE to calculate the difference between the amount of each sale and the overall average. This will allow us to identify the most significant sales: WITH sales_average AS ( SELECT AVG(amount) AS amount_avg FROM sales ) SELECT customer_id, sale_id, amount, amount - amount_avg as difference FROM sales, sales_average; When MySQL encounters a query like the one above, it first runs the query in the WITH clause and saves its results temporarily under the name of the CTE (sales_average). Then, it runs the external SELECT using sales_average like an ordinary table; this yields the final result set. Running the entire query with the sample data detailed above returns the following results: customer_idsale_idamountdifference 800340121507995294.45-168.7500 7900552315079981045.12582.2500 48102066150800112.57-449.7500 492300211508022499.1436.2500 CTE with Grouping and Aggregate Functions Let’s look at a slightly more complex example: we will create a query that uses a CTE to find the oldest calls in a call center. We have a table called calls that includes the caller_id and the call_time of each call (among some other data like call duration that we won’t include here). These are some rows from the calls table: caller_idcall_time 1-555-9784-45162024-08-11 08:45:23.124 1-549-995-04472024-08-11 17:02:55.045 1-555-9784-45162024-08-12 09:22:14.341 1-549-995-04472024-08-13 11:36:38.229 1-599-1008-99982024-08-14 13:22:59.003 Our CTE, called first_contact, will return the date and time of each caller’s oldest call. In the external SELECT, we join the CTE with the table calls to return all the data of each oldest call. In this example (unlike in the previous one), we’ll detail the names of the columns returned by the CTE in its declaration: WITH first_contact (caller_id, first_call_time) AS ( SELECT caller_id, MIN(call_time) AS first_call_time FROM calls GROUP BY caller_id ) SELECT c.* FROM calls AS c INNER JOIN first_contact AS fc ON fc.caller_id = c.caller_id AND fc.first_call_time = c.call_time; To run the above query, MySQL will first get the results of the subquery under the name first_contact. This will temporarily store all distinct caller_id values together with the oldest call_time of each. Then, the external SELECT will join the temporary results of the previous subquery with the calls table, yielding all data from the oldest call of each caller ID. These are the query results: caller_idcall_timeduration 1-555-9784-45162024-08-11 08:45:2315 1-549-995-04472024-08-11 09:02:55129 1-599-1008-99982024-08-14 13:22:5926 If you’re learning about MySQL CTEs to apply for a job interview, read this compilation of SQL CTE interview questions to see what you’ll be up against. Multiple and Nested CTEs In MySQL, a single SQL statement can contain multiple CTEs. No matter how many CTEs you define in one query, the word WITH is used only once at the beginning of the statement; the CTE definitions are separated by commas. Using Multiple CTEs in a Single Query In the following example, we have a table called customers that stores the area in which each customer is located: customer_idarea 80034012WEST 79005523EAST 48102066CENTER 49230021WEST Suppose we are interested in using the information in this table in conjunction with the sales table. Specifically, we want to obtain the sales totals of the customers belonging to the ‘WEST’ and ‘EAST’ areas. To do this, we will define two CTEs that will allow us to filter only the customers in the areas we are interested in. Then, in the external SELECT statement, we will combine the data from the two CTEs with the sales table to obtain sales averages for each of these areas: WITH customers_west AS ( SELECT * FROM customers WHERE area = 'WEST'), customers_east AS ( SELECT * FROM customers WHERE area = 'EAST') SELECT cw.area, AVG(sw.amount) AS amount_avg FROM customers_west AS cw INNER JOIN sales sw ON sw.customer_id = cw.customer_id UNION SELECT ce.area, AVG(se.amount) FROM customers_east AS ce INNER JOIN sales se ON se.customer_id = ce.customer_id; And here is the result: areaamount_avg WEST396.5 EAST1045 Nesting CTEs MySQL also allows us to nest CTEs so that a CTE can reference a previously-defined CTE – all within one single SQL statement. The technique of nesting CTEs allows us to break down large, complex queries into smaller, more manageable subqueries. This enables us to gradually approach the solution to a problem, eventually reducing the final step to a simple SELECT. In the following example, we use two nested CTEs to obtain total sales grouped by country and region. The sales table includes a country column that indicates the country where each sale was made. customer_idsale_idamountcountry 800340121507995294.45United States 7900552315079981045.12Germany 48102066150800112.57Spain 492300211508022499.14Ireland The countries table includes a region column indicating the geographical region to which each country belongs: countryregion United StatesNorth America GermanyEurope SpainEurope MexicoCentral America To obtain sales totals by region, we use two CTEs that perform a stepwise approach to the final solution: WITH sales_by_country AS ( SELECT country, SUM(amount) AS total_sales_by_country FROM sales group BY country ), sales_by_region AS ( SELECT c.region, SUM(s.total_sales_by_country) AS total_sales_by_region FROM sales_by_country s INNER JOIN countries c ON c.country = s.country GROUP BY c.region ) SELECT * FROM sales_by_region; In the above query, we first define a CTE called sales_by_country that groups the data by country and returns each country’s total sales. Then we define a second CTE called sales_by_region that joins the sales_by_country CTE with the countries table and groups the sales totals by region. Finally, the entire query is solved with a simple SELECT from the sales_by_region CTE. Using MySQL CTEs With Other DML Commands The previous examples used CTEs to solve complex SELECT queries, reducing their complexity to eventually solve them with one simple SELECT. But CTEs can also be used with other commands. In this section, we will see how to use CTEs in INSERT and UPDATE commands. Using a CTE with INSERT Suppose we have an employees table with the columns empl_id (INT), empl_name (VARCHAR) and salary (DECIMAL). Then we have an applicants table with the same columns as employees plus the column aptitude (TINYINT) that indicates whether an applicant has passed the aptitude test (aptitude = 1) or not (aptitude = 0). Only applicants that pass the test are qualified to become employees. Periodically, the rows of the applicants table that have the value 1 in the aptitude column must be inserted in the employees table. For that, we will use a CTE called qualified_applicants to filter the applicants that have passed the aptitude test but have not yet become employees. The data returned by the CTE qualified_applicants is the source data for the INSERT that incorporates it into the employees table. INSERT INTO employees (empl_id, empl_name, salary) WITH qualified_applicants (empl_id, empl_name, salary) AS ( SELECT empl_id, empl_name, salary FROM applicants AS a WHERE a.aptitude = 1 AND NOT EXISTS (SELECT * FROM employees AS e WHERE e.empl_id = a.empl_id) ) SELECT empl_id, empl_name, salary FROM qualified_applicants; When a CTE is used in combination with an INSERT, all the code of the CTE, from the WITH clause to the outer SELECT, is written after the INSERT INTO table (column1, column2, ...) line. To perform this insertion, MySQL will first execute the entire CTE statement and then insert its results into the specified table. The nice thing about the above statement is that it can be executed repeatedly without fear of creating duplicate data or violating a primary key. This is because the WHERE condition within the CTE definition includes a clause that prevents the INSERT from attempting to reinsert data that is already in the employees table. Using a CTE with UPDATE Just as we used a MySQL CTE to insert rows into a table, we can also use a CTE to update the table with new information. In the following example, we will see how to use a CTE in conjunction with an UPDATE command. We will use the same employees table and a new salaries table with the columns empl_id (INT) and salary (DECIMAL). This table stores the updated salaries of each employee. The CTE will return the rows of salaries in which that employee’s salary is higher than it is in the employees table for the same employee. Here’s the full query: WITH raised_salaries (empl_id, salary) AS ( SELECT s.empl_id, s.salary FROM salaries s INNER JOIN employees e ON e.empl_id = s.empl_id WHERE s.salary > e.salary ) UPDATE employees e INNER JOIN raised_salaries rs ON rs.empl_id = e.empl_id SET e.salary = rs.salary; The syntax required to update data from a CTE follows the general form of the MySQL UPDATE JOIN command. When used together with a CTE, the UPDATE JOIN command must be used as a replacement for the external SELECT of the CTE. The name of the CTE is placed in the JOIN clause to combine it with the table to be updated. Obviously, you cannot update the CTE’s fields – you can only update the fields of the table(s) joined with the CTE. Like the INSERT statement we used previously, this combination of MySQL CTE and UPDATE statement can be executed repeatedly without changing a single letter. Each time it is executed, it will update only the rows of employees with a lower salary than the one stated in the salaries table. Recursive CTEs in MySQL In addition to simplifying and clarifying complex queries, CTEs make it possible to implement recursion in MySQL. By being able to resolve recursion using only SQL statements, we avoid the need to employ other programming languages. This lets us write more efficient queries. Recursive CTEs are composed of the following elements: An anchor member that provides the starting value(s) for the recursive sequence. A recursive member that takes the results of the previous iterations and adds new information to them. The anchor member(s) and the recursive member(s) are combined by UNION clauses. A termination condition, which is a WHERE clause (appended to the recursive member) that defines the condition that will determine the end of the recursive cycle. An invocation, or the external SELECT that refers to the recursive CTE by name (as with any other CTE). You will find more detailed information in this article on recursive CTEs. A typical use of recursive functions in programming is the generation of Fibonacci numbers up to a determined value. Fibonacci numbers are used in areas as varied as the development of search algorithms and the simulation of population growth. In the following recursive MySQL CTE, we obtain the first 10 numbers of the Fibonacci sequence: WITH RECURSIVE cte_fib AS ( SELECT 1 counter, CAST(0 AS decimal) fibo, CAST(0 AS decimal) prev UNION ALL SELECT 2, 1, 0 UNION ALL SELECT counter + 1, fibo + prev, fibo FROM cte_fib WHERE counter < 10 AND fibo > 0 ) SELECT counter, fibo FROM cte_fib ORDER BY counter; In the above example, the anchor member is formed by the two first members of the UNION, because they provide the initial elements of the Fibonacci series (the first two rows in the resulting data): SELECT 1 counter, CAST(0 AS decimal) fibo, CAST(0 AS decimal) prev UNION ALL SELECT 2, 1, 0 The recursive member is the third part of the UNION, because it adds a row to the data returned by the preceding iteration of the same CTE: SELECT counter + 1, fibo + prev, fibo FROM cte_fib WHERE counter < 10 AND fibo > 0 And the termination condition is the WHERE clause within the parenthesis: WHERE counter < 10 AND fibo > 0 It states that the recursion should stop when counter reaches a value of 10. And the invocation is the SELECT outside the parentheses, which returns the counter and the Fibonacci number for each row returned by the CTE. There are many other uses of recursive CTEs in MySQL. For example, they can be used to traverse tables with hierarchically ordered information (e.g. organizational charts) or tree or graph data structures. They can even be used for such unconventional SQL tasks as drawing a Christmas tree with the results of a query. MySQL CTEs In a Nutshell Throughout this article, we have taken a tour of CTEs in MySQL. We’ve learned that common table expressions: Simplify complex queries by breaking them down and organizing them into smaller, easier-to-understand parts. This also improves queries’ readability. Help us write more efficient code. A subquery defined as a CTE can be reused in different parts of the same query – even in other subqueries – without having to repeat it. Encapsulate logic without creating unnecessary objects. CTEs encapsulate the logic of subqueries under a name, just as views do. But unlike views, they do not imply the creation of perennial objects in the database. Implement recursion. This is one of the main reasons for the popularity of CTEs, since without them it is rather complicated to implement recursion in MySQL. By now, you should have an idea of what can be done with recursive CTEs in MySQL. After reading this article you will be ready to solidify your knowledge through some practical exercises on CTEs. But what you have seen in this article is just the tip of the iceberg. To discover the full potential of CTEs in MySQL, consider taking LearnSQL.com's course Recursive Queries in MySQL. With its 114 interactive exercises, this course will provide you all the knowledge you need to master recursive queries in MySQL 8. You don’t even need to have access to a MySQL server; the course environment will provide all the necessary tools. All you need is a web browser, an internet connection, and knowledge of basic SQL. You can even take the introductory quiz for free to make sure this course suits your needs! In conclusion, CTEs are a powerful tool in MySQL, particularly when the tidiness and readability of SQL code are valued. In multidisciplinary teams, –where experienced SQL database engineers collaborate with data analysts, database designers, and DBAs –CTEs are a valuable resource to increase team productivity and obtain more maintainable and durable software. Tags: MySQL Common Table Expressions