27th Apr 2023 9 minutes read How to Use the SQL RANK OVER (PARTITION BY) Tihomir Babic window functions Table of Contents What Is RANK()? What Does RANK() Do? How RANK() Works With OVER (ORDER BY) How RANK() Works With OVER (PARTITION BY) Bonus Example Add RANK() to Your SQL Vocabulary Ranking data in SQL is a breeze if you know how to use RANK() to rank over a partition. This article shows you how to do it, as well as how RANK() differs from DENSE_RANK() and ROW_NUMBER(). If you work with SQL in a professional setting, you have had to rank data at least once. Think of ranking months/years/quarters by revenue or costs, best-selling products, most viewed posts or streamed songs, employees by salary, most profitable branches, you name it. Or ranking books by their sales. The thing is, you often have to rank data within a certain category, or partition as we call it. The SQL RANK OVER (PARTITION BY) enters the game! This is one typical example of window functions in SQL. For a more detailed explanation of window functions, our window functions course is the best. Through 218 interactive exercises, you learn more about PARTITION BY and other window function clauses, such as ORDER BY, ROWS, and RANGE. These functions are used not only in ranking but also in aggregate and analytics window functions. Let’s get back to book sales. Take a look at this table, with the code for creating it here. How do we rank book sales within each language? idtitleauthororiginal_languagesalesclassify_under 1The HobbitJ. R. R. TolkienEnglish100Fantasy 2Watership DownRichard AdamsEnglish50Fantasy 3Harry Potter and the Philosopher's StoneJ. K. RowlingEnglish120Fantasy 4The PlagueAlbert CamusFrench12Classics 5The Divine ComedyDante AlighieriItalian12Poetry 6War and PeaceLeo TolstoyRussian36Classics 7Nineteen Eighty-FourGeorge OrwellEnglish30Classics 8Andromeda NebulaIvan YefremovRussian20Science fiction 9The Little PrinceAntoine de Saint-ExupéryFrench200Kids 10The StrangerAlbert CamusFrench10Classics 11The Adventures of PinocchioCarlo CollodiItalian35Kids 12The Name of the RoseUmberto EcoItalian50Classics 13One Hundred Years of SolitudeGabriel García MárquezSpanish50Classics 14Don QuixoteMiguel de CervantesSpanish500Classics 15LolitaVladimir NabokovEnglish50Classics The solution is simple when you know that the language, in this example, acts as something called data partition. SELECT original_language, title, author, sales, RANK() OVER (PARTITION BY original_language ORDER BY sales DESC) AS sales_rank FROM books; And here it is! original_languagetitleauthorsalessales_rank EnglishHarry Potter and the Philosopher's StoneJ. K. Rowling1201 EnglishThe HobbitJ. R. R. Tolkien1002 EnglishLolitaVladimir Nabokov503 EnglishWatership DownRichard Adams503 EnglishNineteen Eighty-FourGeorge Orwell305 FrenchThe Little PrinceAntoine de Saint-Exupéry2001 FrenchThe PlagueAlbert Camus122 FrenchThe StrangerAlbert Camus103 ItalianThe Name of the RoseUmberto Eco501 ItalianThe Adventures of PinocchioCarlo Collodi352 ItalianThe Divine ComedyDante Alighieri123 RussianWar and PeaceLeo Tolstoy361 RussianAndromeda NebulaIvan Yefremov202 SpanishDon QuixoteMiguel de Cervantes5001 SpanishOne Hundred Years of SolitudeGabriel García Márquez502 Now, let’s dissect this example. What Is RANK()? In simplest terms, RANK() is a window function. The SQL window functions are similar to aggregate functions in that they are applied to a group of rows. One big difference: window functions keep details of the individual rows, unlike aggregate functions with GROUP BY. What Does RANK() Do? As the name suggests, it ranks data. This makes it a ranking window function, along with DENSE_RANK() and ROW_NUMBER(). When using any of these window functions, it must be accompanied by an OVER (ORDER BY) clause. The OVER() clause is mandatory for any window function. It’s what turns a “regular” function into a window function. For these ranking functions, the ORDER BY in parentheses defines the order in which ranking is done. This order may be ascending or descending. Keep in mind that it does not affect the order of the rows in the result; that is done with an ORDER BY at the end of the query. Learn more about this in our article about the RANK() window function. How RANK() Works With OVER (ORDER BY) Let’s use the dataset above. Take the whole table and rank the books by sales. Let’s see what happens. SELECT title, author, sales, RANK() OVER (ORDER BY sales DESC) AS sales_rank FROM books; This is the same as the previous query, except this one does not select the language and does not use PARTITION BY. So, the RANK() function is followed by OVER(). The ORDER BY clause in it tells the function to rank the data by sales in descending order, i.e., from the highest- to the lowest-selling books. Since the PARTITION BY clause is omitted, the function ranks the whole table. Here are the first ten rows of the output. titleauthorsalessales_rank Don QuixoteMiguel de Cervantes5001 The Little PrinceAntoine de Saint-Exupéry2002 Harry Potter and the Philosopher's StoneJ. K. Rowling1203 The HobbitJ. R. R. Tolkien1004 LolitaVladimir Nabokov505 Watership DownRichard Adams505 The Name of the RoseUmberto Eco505 One Hundred Years of SolitudeGabriel García Márquez505 War and PeaceLeo Tolstoy369 The Adventures of PinocchioCarlo Collodi3510 The critical thing to notice here: there are four books with 50 million copies sold, and all are ranked fifth. This is how RANK() differs from the other two ranking window functions – it gives the same rank to tied values. When the function reaches the next sales value (in this case, 36 million copies sold), it does not assign the next consecutive rank value (6) but rather skips to adjust for the count of tied sales values. As mentioned, the rank “5” appears four times; therefore, the next rank assigned is nine. DENSE_RANK() also ranks the ties with the same ranking. However, unlike RANK(), it does not skip rank values based on ties. The same data ranked with DENSE_RANK() is as follows. titleauthorsalessales_rank Don QuixoteMiguel de Cervantes5001 The Little PrinceAntoine de Saint-Exupéry2002 Harry Potter and the Philosopher's StoneJ. K. Rowling1203 The HobbitJ. R. R. Tolkien1004 LolitaVladimir Nabokov505 Watership DownRichard Adams505 The Name of the RoseUmberto Eco505 One Hundred Years of SolitudeGabriel García Márquez505 War and PeaceLeo Tolstoy366 The Adventures of PinocchioCarlo Collodi357 After several books ranked fifth, the next rank is sixth, not ninth as with RANK(). What about ROW_NUMBER()? It does not care about ties or skipping. It just ranks rows sequentially. The first ten rows using ROW_NUMBER() are below. titleauthorsalessales_rank Don QuixoteMiguel de Cervantes5001 The Little PrinceAntoine de Saint-Exupéry2002 Harry Potter and the Philosopher's StoneJ. K. Rowling1203 The HobbitJ. R. R. Tolkien1004 LolitaVladimir Nabokov505 Watership DownRichard Adams506 The Name of the RoseUmberto Eco507 One Hundred Years of SolitudeGabriel García Márquez508 War and PeaceLeo Tolstoy369 The Adventures of PinocchioCarlo Collodi3510 The differences between these functions are explained in the ranking functions overview. How RANK() Works With OVER (PARTITION BY) The PARTITION BY clause divides data into partitions or subsets. When used with RANK(), this means the data is ranked within the partition. When it reaches the second partition, the ranking is reset to start from one. Let’s look at the query from the beginning of this article again to clarify this. SELECT original_language, title, author, sales, RANK() OVER (PARTITION BY original_language ORDER BY sales DESC) AS sales_rank FROM books; In this query, RANK() is used with PARTITION BY. The partition, in this case, is original_language. The ranking is done by sales in descending order as specified in the ORDER BY clause. The way in which we have written PARTITION BY and ORDER BY means that books are ranked by sales but within each language category. Once the function ranks all the books in one language, it restarts as it reaches the second language, and so on. We see this in the query output. original_languagetitleauthorsalessales_rank EnglishHarry Potter and the Philosopher's StoneJ. K. Rowling1201 EnglishThe HobbitJ. R. R. Tolkien1002 EnglishLolitaVladimir Nabokov503 EnglishWatership DownRichard Adams503 EnglishNineteen Eighty-FourGeorge Orwell305 FrenchThe Little PrinceAntoine de Saint-Exupéry2001 FrenchThe PlagueAlbert Camus122 FrenchThe StrangerAlbert Camus103 ItalianThe Name of the RoseUmberto Eco501 ItalianThe Adventures of PinocchioCarlo Collodi352 ItalianThe Divine ComedyDante Alighieri123 RussianWar and PeaceLeo Tolstoy361 RussianAndromeda NebulaIvan Yefremov202 SpanishDon QuixoteMiguel de Cervantes5001 SpanishOne Hundred Years of SolitudeGabriel García Márquez502 We have marked each partition with a different color to see the different partitions easily. The highest-selling book in English is J.K. Rowling’s Harry Potter and the Philosopher's Stone. Then it is The Hobbit. Lolita and Watership Down are both ranked third, as RANK() assigns the same rank to books with the same sales value. Then one rank value is skipped, and Nineteen Eighty-Four is ranked fifth. The next partition is the French language, and the ranking is restarted. The Little Prince is ranked the best-selling book in French. We see the same logic works for books in Italian, Russian, and Spanish. Bonus Example We have studied these queries to show how RANK() OVER (PARTITION BY) works. Now, let’s practice! The following isn’t much different from the first query. You should not have problems applying what you have learned. There’s a column in the books table named classify_under. It specifies the category under which each book is to be placed in the bookstore. Let’s rank the books by sales for each category. SELECT classify_under, title, author, sales, RANK() OVER (PARTITION BY classify_under ORDER BY sales DESC) AS sales_rank FROM books; The query selects the column classify_under instead of original_language in the first query. All other selected columns are the same. There’s also a difference in RANK(). Since we’re ranking by classify_under, this is the column that must be in the PARTITION BY clause. Once again, we want to rank books by sales in descending order. Here’s the ranking: classify_undertitleauthorsalessales_rank ClassicsDon QuixoteMiguel de Cervantes5001 ClassicsLolitaVladimir Nabokov502 ClassicsThe Name of the RoseUmberto Eco502 ClassicsOne Hundred Years of SolitudeGabriel García Márquez502 ClassicsWar and PeaceLeo Tolstoy365 ClassicsNineteen Eighty-FourGeorge Orwell306 ClassicsThe PlagueAlbert Camus127 ClassicsThe StrangerAlbert Camus108 FantasyHarry Potter and the Philosopher's StoneJ. K. Rowling1201 FantasyThe HobbitJ. R. R. Tolkien1002 FantasyWatership DownRichard Adams503 KidsThe Little PrinceAntoine de Saint-Exupéry2001 KidsThe Adventures of PinocchioCarlo Collodi352 PoetryThe Divine ComedyDante Alighieri121 Science fictionAndromeda NebulaIvan Yefremov201 By pure coincidence, there are five partitions again. In the “Classics” category, Don Quixote is the best-selling book. Then, there are three books ranked second. The ranking sequence is skipped until we get to War and Peace in fifth place. The rest of the Classics are ranked sequentially since there are no more ties. In other categories, there are no ties. “Poetry” and “Science fiction” have only one book in each category. So, there’s just the first rank. For more examples with other window functions, go to our article explaining how to use PARTITION BY. Add RANK() to Your SQL Vocabulary We have seen the most typical uses of the RANK() window function. While it requires an OVER (ORDER BY), the PARTITION BY clause unlocks its possibilities. It makes RANK() a sophisticated tool for ranking data across one or more partitions with ease in your day-to-day work. To learn more about and practice RANK() and other (ranking) window functions, use our Window Functions course. You get an even more detailed explanation of ranking and a chance to write plenty of code in our exercises. Happy ranking! Tags: window functions