15th Jun 2023 10 minutes read An Overview of MySQL Window Functions Dominika Florczykowska MySQL window functions Table of Contents What Are Window Functions in MySQL? How to Define a Window Function: The OVER() Clause Examples of Window Function Queries in MySQL Example 1: Empty OVER() Clause - Computing a Statistic for All Rows Example 2: OVER() with ORDER BY - Computing a Ranking Example 3: OVER() with PARTITION BY - Computing a Statistic for Each Partition Example 4: OVER() with PARTITION BY and ORDER BY - Computing a Statistic for Each Partition in a Specific Order Ready to Practice MySQL Window Functions? MySQL window functions are very helpful when you want to create meaningful reports with SQL. In this article, we’ll demonstrate the most common window functions in MySQL and explain how to use them. MySQL 8.0 introduced a new feature: window functions. These functions are very useful to data analysts and anyone who creates reports using MySQL. Using them, you can easily compute moving averages, cumulative sums, and other calculations over specified subsets of your data. And you can do it without creating complex SQL queries or temporary tables. If you are a regular SQL user, you may have come across window functions in your work. While many databases have had these functions for some time, MySQL lagged behind until 2018. With MySQL 8, the platform is now up to speed! In this article, we will explore the benefits of window functions in MySQL and how they can enhance your data analysis. If you want hands-on practice using window functions, check out our interactive Window Functions in MySQL 8 course. It offers over 200 interactive exercises on MySQL window functions. What Are Window Functions in MySQL? An SQL window function performs calculations across a set of table rows that are related to the current row. This set of rows is called a window or a window frame – that's where the term "window functions" comes from. Let's start with a simple example. Imagine that you would like to calculate the sum of values in all the rows, but you want the result shown on each row. You might need this information to compare individual values with the total during data analysis. This is a breeze if you know how to use window functions! The result of your query would look like this: monthrevenuetotal January10,00080,000 February20,00080,000 March20,00080,000 April30,00080,000 You may have noticed that window functions are similar to aggregate functions. They both compute an aggregate value for a certain group of rows. However – unlike the GROUP BY clause – window functions in SQL do not collapse rows. Instead, the resulting table shows both individual and aggregate values. This can be handy in reports where you need to work with the aggregate and non-aggregate values at the same time. How to Define a Window Function: The OVER() Clause Window functions are defined using the OVER() clause: SELECT …, <window_function> OVER(...), … FROM … The OVER() clause tells the database to use a window function. The simplest form of the window frame is when the brackets are left empty, like this: OVER(). This means that the window consists of all the rows in the table. Additional clauses can be included within the OVER() clause to further define the window. In this article, we will focus on the clauses PARTITION BY and ORDER BY. There are other clauses that can be used within OVER(), but we won’t cover them in this article. If you would like to dive even deeper, check out our Window Functions in MySQL 8 course. Or you can visit this article on MySQL window functions that provides some great examples of how to use them in your queries. Examples of Window Function Queries in MySQL Let's go over some example queries to better understand where and how you can make use of window functions. In our example scenario, we have a website that allows users to participate in quizzes. There are various quiz categories and the maximum number of points quiz participants can score is 100. To store the participants' scores, this website uses the participant table. It has the following columns: id – The ID of the participant, which is also the primary key (PK) of the table. name – The name of the participant. quiz_score – The participant's score. quiz_date – The date when the quiz was attempted. quiz_category – The category of the quiz. Here you can see a handful of rows from the table: idnamequiz_scorequiz_datequiz_category 1Charlee Freeman902023-04-10science 2Christina Rivas252023-04-02history 3Amira Palmer1002023-04-01history 4Carlos Lopez782023-04-04music 5Alba Gomez452023-04-05music 6Michael Doe922023-04-12science 7Anna Smith862023-04-11science Now that you’re familiar with the data, let's dive into using window functions! You might find this SQL Window Functions Cheat Sheet handy as a quick reference guide as we go through the examples. Example 1: Empty OVER() Clause - Computing a Statistic for All Rows Let's say that we would like to return each participant’s score, the category of the quiz they attempted, and the highest score ever achieved in all the quizzes. We can do so using an empty OVER() clause. This way, our window is going to include all the rows in the query. Here's the query that we’d run: SELECT name, quiz_score, quiz_category, MAX(quiz_score) OVER() AS max_score FROM participant; And the result will be as follows: namequiz_scorequiz_categorymax_score Charlee Freeman90science100 Christina Rivas25history100 Amira Palmer100history100 Carlos Lopez78music100 Alba Gomez45music100 Michael Doe92science100 Anna Smith86science100 The MAX() function was applied to all the rows in our query. You can see that the highest score was 100 and it's displayed for all the rows along with the individual scores. You can use the empty OVER() clause with other functions, such as COUNT(), SUM(), AVG(), and others. This lets you compute one overall statistic for all rows in the query; you can compare this overall statistic with the value in each individual row. You can read more about using the OVER() clause in MySQL in our article What Is the MySQL OVER Clause? Example 2: OVER() with ORDER BY - Computing a Ranking When used within the OVER() clause, ORDER BY determines the order in which the rows are ordered in the window frame. Let’s see an example: We can use this query to create a ranking of quiz scores: SELECT name, quiz_score, quiz_category, RANK() OVER(ORDER BY quiz_score DESC) AS rank FROM participant; The RANK() window function assigns a rank to each row within a partition; this rank is based on the value of a specified expression. The first row gets the rank 1, the second row gets the rank 2, etc. More specifically, the RANK() function assigns a unique rank to each distinct value of the expression within the partition. Rows with the same value will have the same rank, and the next rank will be skipped. For example, if two rows have the same value and receive a rank of 1, the next rank assigned will be 3, skipping rank 2. You can read more about ranking window functions in SQL on our blog. Here, we use the RANK() function to calculate the ranking of each participant's quiz score. The OVER() clause with the ORDER BY clause determines the order in which the RANK() function is applied. In this case, the ORDER BY clause is set to quiz_score DESC, which means that the quiz scores are ordered in descending order (from highest to lowest) before the ranking is calculated. The first row (with the highest value) gets rank 1, the second row gets rank 2, etc. Here’s what the code returns: namequiz_scorequiz_categoryrank Amira Palmer100history1 Michael Doe92science2 Charlee Freeman90science3 Anna Smith86science4 Carlos Lopez78music5 Alba Gomez45music6 Christina Rivas25history7 Well done! We were able to assign each participant a rank. Use the OVER (ORDER BY) clause in MySQL when you want to apply a function to the rows in a specific order. This can be useful when calculating running totals, moving averages, and creating various rankings. Check out our MySQL date functions guide to find out more on how to work with date and time functions in MySQL. Example 3: OVER() with PARTITION BY - Computing a Statistic for Each Partition Let's make more use of the category column. Remember the first example query that we went through? For each participant, we displayed their score, the category of the quiz they attempted, and the highest score ever achieved in all quizzes. This time, we would like to do something similar. However, instead of showing the highest score ever achieved in all the quizzes, we will show the highest score ever achieved in that quiz’s category. For this, we’ll need the OVER() clause with PARTITION BY. To partition data in SQL means to divide a set of rows into smaller groups based on a specified column or columns. It's a bit similar to the GROUP BY clause, but window functions don’t collapse the rows. We can use this query: SELECT name, quiz_score, quiz_category, MAX(quiz_score) OVER(PARTITION BY quiz_category) AS max_score_in_category FROM participant; The PARTITION BY clause with the OVER() clause determines the partition of data over which the MAX() function is applied. In this case, the PARTITION BY clause is set to quiz_category, which means that the data is divided into partitions based on the category of the quiz. That's exactly what we wanted! Here's the result: namequiz_scorequiz_categorymax_score_in_category Amira Palmer100history100 Christina Rivas25history100 Carlos Lopez78music78 Alba Gomez45music78 Anna Smith86science92 Michael Doe92science92 Charlee Freeman90science92 For each participant, we displayed both their individual score and the highest score in their category. Sounds fair, as the music quiz could have been more difficult than the science quiz! OVER (PARTITION BY) in MySQL is useful for calculating cumulative sums or average values, creating rankings within groups, identifying the best or worst performers, and much more. This clause provides flexibility and advanced functionality to SQL queries, enabling powerful data analysis and manipulation within subsets of data. You can learn more about using SQL PARTITION BY with OVER elsewhere on our blog. That was easy, right? Let's try something more complicated! Example 4: OVER() with PARTITION BY and ORDER BY - Computing a Statistic for Each Partition in a Specific Order Can we use both PARTITION BY and ORDER BY at the same time? Of course we can! This combination is useful in many situations. In MySQL, using the OVER() clause with PARTITION BY and ORDER BY allows you to perform calculations and analysis on specific partitions of data while controlling the order in which the calculations are applied within each partition. The PARTITION BY clause divides the result set into distinct partitions based on specified columns or expressions. Each partition is treated separately for the calculation or analysis. The ORDER BY clause, when used with OVER(), determines the order in which the data is processed within each partition. It specifies the column or expression by which the data should be sorted. Let's see this duo in action. In this example, we will calculate the cumulative average of quiz scores by category. A cumulative average is the average of a set of values up to a certain point. This is the query that we will use: SELECT name, quiz_date, quiz_score, quiz_category, ROUND( AVG(quiz_score) OVER(PARTITION BY quiz_category ORDER BY quiz_date) ) AS cumulative_avg FROM participant; We'd like to see the cumulative average of quiz scores by category. To achieve this, we used PARTITION BY quiz_category, just like we did last time. Also, it makes sense that the cumulative average is calculated from the oldest to the newest date, so we used ORDER BY quiz_date. This means that the data within each partition is sorted by the quiz date in ascending order (largest to smallest) before the AVG() function is applied. This is what the query output looks like: namequiz_datequiz_scorequiz_categorycumulative_avg Amira Palmer2023-04-01100history100 Christina Rivas2023-04-0225history63 Carlos Lopez2023-04-0478music78 Alba Gomez2023-04-0545music62 Charlee Freeman2023-04-1090science90 Anna Smith2023-04-1186science88 Michael Doe2023-04-1292science89 By using PARTITION BY and ORDER BY together within the OVER() clause, the average calculation is applied separately for each quiz category. Within each category, it is calculated in the order of the quiz dates. This means that for each row in the result set, the average calculation considers only the rows within the same quiz category and orders them by date. The cumulative_avg column will reflect the average score up to the current row for each quiz category, taking into account the order of the quiz dates. Ready to Practice MySQL Window Functions? As you can see, window functions in MySQL are a very powerful tool that can help you create complex reports. You can use window functions in MySQL to create rankings and calculate year-over-year metrics, moving averages, and much more! In this article, we've only scratched the surface of all the ways you can use window functions. If you'd like to dive deeper and explore more examples of how to use them, check out these articles on SQL window functions and the MySQL OVER() clause. And for more hands-on practice, remember to check out our Window Functions in MySQL 8 course! Tags: MySQL window functions