29th Jun 2023 10 minutes read How to Use COUNT() with GROUP BY: 5 Practical Examples Radu Gheorghiu GROUP BY aggregate functions numerical functions Table of Contents How to Use COUNT() and GROUP BY When to Use GROUP BY When to Use the COUNT() Function 5 Examples of Using COUNT() with GROUP BY Example #1: GROUP BY a Single Column Example #2: GROUP BY Multiple Columns Example #3: Using WHERE with COUNT() and GROUP BY Example #4: Using ORDER BY with COUNT() and GROUP BY Example #5: COUNT(expression) COUNT(*) - Counting Rows Counting NON-NULL Values in a Column Counting Distinct NON-NULL Values in a Column Want to Know More About COUNT() and GROUP BY? Using the COUNT() function with GROUP BY is one of the most common SQL constructs in aggregate queries. Read this article to find out how to use COUNT() with GROUP BY correctly using 5 examples. In this article, we will explain the importance of using COUNT with GROUP BY. We’ll talk about why it is essential in SQL and how it enables data analysis and summarization based on specific criteria. This combination empowers users to extract meaningful insights, calculate counts, and generate statistical summaries from large datasets. If you’re looking for an in-depth review of basic SQL concepts like COUNT() and GROUP BY, I recommend our interactive SQL Basics course. It contains 129 exercises, which will help you review all key SQL concepts. How to Use COUNT() and GROUP BY If you’re here just for the quick answer, here’s the TLDR: The correct way of using COUNT() with GROUP BY is shown in the query below: SELECT Store, COUNT(*) as NumberOfSales, FROM Sales GROUP BY Store; Let's break down the result to understand how this query works. The rows with the same value in the Store column are grouped together. Imagine an intermediate table where these rows are grouped and marked with different colors, like the image below. This would be our intermediary table containing only the Store column, since that is the column that is part of our SELECT statement. StoreProductIDCustomerID Store A1657 Store A11116 Store A14525 Store B1369 Store B11138 Store C13616 Store C1118 The database then logically counts the number of rows in each group using the COUNT(*) function. This count represents the number of orders for each store. Once the rows are counted, there is no need to have duplicate rows with the same Store value. So, GROUP BY will reduce the number of rows to only unique values. In other words, it will get rid of the individual rows and give us a summary of each group. We end up with the following result: StoreNumberOfSales Store A3 Store B2 Store C2 When to Use GROUP BY GROUP BY is an SQL clause that groups rows based on one or more column values. It is often used in combination with aggregate functions like COUNT(), SUM(), AVG(), MAX(), and MIN() to perform calculations on grouped data. The GROUP BY clause is useful when you want to: Make calculations and aggregations on subsets of data. Generate summary statistics and metrics for different groups or categories. Identify patterns and trends within specific groups. Generate reports and analyze data based on different dimensions or attributes. Apply filters and conditions on grouped data, using the HAVING clause In summary, GROUP BY is used to organize and summarize data based on specific columns, functions or expressions, which will allow you to gain insights and perform calculations on distinct groups within a dataset. When to Use the COUNT() Function COUNT() is one of SQL’s most common aggregate functions. It returns the number of rows that match a specified condition or are included in a result set. It is often used to retrieve the total number of records in a table or to calculate the number of occurrences of a particular value within a column. 5 Examples of Using COUNT() with GROUP BY Now that we’ve gone through the basic scenarios where COUNT() and GROUP BY are used, let’s go over some of the more complicated examples. We’ll start off with simpler examples and work our way up to more complex scenarios. Example #1: GROUP BY a Single Column The simplest scenario you could encounter is when you need to GROUP BY a single column. In the following example, we need to find out how our company’s employee count is distributed across different job titles. Before seeing the SQL solution to this scenario, let’s go over the sample data. This is the employees table: EmployeeIDFirstNameLastNameJobTitle 1JohnDoeManager 2JaneSmithSupervisor 3MarkJohnsonDeveloper 4EmilyWilliamsAnalyst 5MichaelBrownDesigner 6SarahDavisDeveloper 7RobertWilsonDesigner 8JessicaTaylorDeveloper By using the COUNT function with GROUP BY on the JobTitle column, we can get a breakdown of the number of employees in each specific role. You can see the query and the output (based on the sample data) below: SELECT JobTitle, COUNT(*) AS NumberOfEmployees FROM employees GROUP BY JobTitle; JobTitleNumberOfEmployees Analyst1 Designer2 Developer3 Manager1 Supervisor1 This example works the same way as our initial query. GROUP BY puts the rows for employees with the same job title into one group. Then the COUNT() function counts the rows in each group. GROUP BY then collapses the rows in each group, keeping only the value of the column JobTitle and the count. Example #2: GROUP BY Multiple Columns Of course, you can group rows by more than one column. In this example, we will look at a sample orders table containing basic order information: OrderIDCustomerIDProductIDProductCategoryOrderDateStatusAmount 11011001Electronics2023-05-01Completed150.00 21021002Clothing2023-05-02Completed80.00 31011001Home Goods2023-06-03In progress60.00 4103NULLAccessories2023-06-03Canceled200.00 51011002Electronics2023-07-04NULL120.00 61021001NULL2023-07-05NULLNULL 71031002Clothing2023-07-06In progress90.00 81021002Accessories2023-08-07NULL75.00 9103NULLNULL2023-08-08NULL100.00 101011001Home Goods2023-09-09NULLNULL 111021001Home Goods2023-06-05In progress80.00 121031004Accessories2023-06-06Completed75.00 131021005Electronics2023-08-06Completed88.00 We need to write a query that will show the number of orders placed by each customer and the ProductCategory of that order. This means that we will have to return the CustomerID and the category that the order falls within. The query will look like this: SELECT CustomerID, ProductCategory, COUNT(*) AS NumberOfOrders FROM orders GROUP BY CustomerID, ProductCategory; And the result of running this query can be seen below: CustomerIDProductCategoryNumberOfOrders 101Electronics2 101Home Goods2 102Electronics2 102Accessories2 102Clothing1 103Accessories2 103Clothing1 103NULL1 Our query groups rows by two columns: CustomerID and ProductCategory. This means that GROUP BY groups rows with the same values of CustomerID and ProductCategory into one group. (Rows for CustomerID 101 and the category Electronics are in one group, but rows for CustomerID 101 and the category Home Goods are in a different group.) Then the rows in each group are counted by COUNT(). Example #3: Using WHERE with COUNT() and GROUP BY Our previous example analyzed a scenario where we wanted to create an aggregate of all of the information in our orders table. But sometimes, we might want to slice into this information and only see orders for selected categories. Using the same sample data that we had before, we will now write a query that shows the same information for orders that fall within the “Accessories” or “Clothing” ProductCategory. To do this, we can use the query from Example 2 and just add a WHERE clause. This clause will filter for records where ProductCategory is equal to “Accessories” or “Clothing”. SELECT CustomerID, ProductCategory, COUNT(*) AS NumberOfOrders FROM orders WHERE ProductCategory IN (‘Accessories’, ‘Clothing’) GROUP BY CustomerID, ProductCategory; CustomerIDProductCategoryNumberOfOrders 102Accessories2 103Accessories1 102Clothing1 103Clothing1 If it’s not entirely intuitive how the database has generated the results using the query above, here’s a step-by-step explanation of what happened behind the scenes: First, the database scans the orders table and reads all the rows. It then applies the filtering condition in WHERE ProductCategory IN (‘Accessories’, ‘Clothing’ ) to filter the rows. After this step, only the rows where the product category is “Accessories” or “Clothing” are considered for further processing. The filtered rows are then grouped based on the values in the CustomerID and ProductCategory columns, which are specified in the GROUP BY clause. For each of the unique combinations of CustomerID and ProductCategory, the COUNT(*) function is applied. This will count the number of rows within each group. The final result set includes the CustomerID, ProductCategory and the count of orders (shown in the NumberOfOrders column) for each group. In summary, the database will filter the rows that respect the specified filtering condition. It will then group them according to the specified columns in the GROUP BY clause, and then calculate the count of orders within each of those groups. The final output will include the CustomerID, ProductCategory and the corresponding count of orders for each unique combination of CustomerID and ProductCategory. Example #4: Using ORDER BY with COUNT() and GROUP BY Looking at the result from example number 4, we can see that some of the rows in the output are mixed. This is because the columns in the GROUP BY list will generally require – but not guarantee – sorting the output based on the list of columns in the GROUP BY section. But if, for example, we need to organize the results on different conditions ( e.g. a timeline or by date) we will need to use an ORDER BY clause: SELECT CustomerID, ProductCategory, COUNT(*) AS NumberOfOrders FROM orders WHERE ProductCategory IN (‘Accessories’, ‘Electronics’) GROUP BY CustomerID, ProductCategory ORDER BY ProductCategory, CustomerID; CustomerIDProductCategoryNumberOfOrdes 101Accessories2 102Accessories2 102Electronics1 103Electronics2 We’ve added the ORDER BY clause with the list of columns we want to sort the data by. As you can see, this sorts the output information based on the order of the columns listed. Example #5: COUNT(expression) So far, we’ve looked at simple examples of working with COUNT(); the purpose was to count all of the rows in the source dataset or table. However, there are more complex ways of using the COUNT–GROUP BY combination. To explain this, we’ll build a new sample dataset. We have a table called SurveyResponses that stores answers to a survey. Some of the questions are optional, which is why there are some respondents with NULL answer values; they’ve skipped the optional questions. You can see the sample data below: ResponseIDRespondentIDRespondentNameQuestionIDAnswer 1101John1Agree 2101John2No 3101John3Yes 4102Sarah1Yes 5102Sarah2Not Sure 6102Sarah3NULL 7103Mark1No 8103Mark2Maybe 9103Mark3No 10104Emily1Yes 11104Emily2Not Sure 12104Emily3Disagree COUNT(*) - Counting Rows The most common usage of the COUNT function (and its default functionality, even if it’s used with GROUP BY) is to count the number of rows. For example, if we want to count the answer types of survey questions, we can use the following query: SELECT COUNT(*) AS NumberOfYesAnswers, Answer FROM SurveyResponses GROUP BY Answer; The output will look like this: NumberOfAnswersAnswer 3Yes 3No 2Not Sure 1Agree 1Disagree 1Maybe 1NULL The outcome of this is a count of all rows that match a certain answer. It returns the aggregate number of survey answers and the type of the answer. Counting NON-NULL Values in a Column Let’s look at another example that might seem to generate the same results. Actually, this example has something particular and important about it; instead of using * as a parameter in our COUNT() function, we’re using COUNT() with a column name. We will modify the query in the previous example. Instead of the * parameter for the COUNT() function, we will replace it with the Answer column. Our new query looks like this: SELECT COUNT(Answer) AS NumberOfAnswers, AnswerFROM SurveyResponses GROUP BY Answer; If we run this new query, we will see that the results returned are almost exactly the same: NumberOfAnswersAnswer 3Yes 3No 2Not Sure 1Agree 1Disagree 1Maybe 0NULL We can see that the output for the NULL answer value has changed from 1 to 0. This is because the COUNT() function only takes into consideration non-NULL values when doing the aggregation. Before we were doing COUNT(*), which implicitly means count the rows; COUNT(Answer) will count the values in the Answer column. And because we had 1 value with NULL, it will skip those values in its calculation – returning 0 in this second scenario. Counting Distinct NON-NULL Values in a Column In this third variation of using the COUNT function, we will use the same query as in our previous example. This time, though, we will add the DISTINCT keyword before the column name. SELECT Answer, COUNT(DISTINCT Answer) AS DistinctCount FROM SurveyResponses GROUP BY Answer; StatusDistinctCount Yes1 No1 Not Sure1 Agree1 Disagree1 Maybe1 NULL0 We can see in the output above that the result of this query has turned all of the positive values in the DistinctCount column to 1. Adding the DISTINCT keyword means that each time the COUNT function finds a new value it hasn’t seen before, it will add that status to its list and add a 1 to its count. However, if it finds the same status value a second time or more, it will skip counting it. Want to Know More About COUNT() and GROUP BY? Using COUNT() with GROUP BY is only one of the many powerful features that SQL offers. If you’re eager to explore SQL further and truly master its capabilities, I encourage you to check out our comprehensive SQL Basics course. If you’re looking for SQL practice, I recommend our SQL Practice track, which has over 600 SQL practice exercises. After taking our courses, maybe you feel like you want to find a job working with SQL. To help you prepare for your interview, we have a list of SQL interview questions that will show you some real-world examples of questions and problems in SQL. Tags: GROUP BY aggregate functions numerical functions