29th Jun 2023 10 minutes read

How to Use COUNT() with GROUP BY: 5 Practical Examples

Table of Contents

How to Use COUNT() and GROUP BY
- When to Use GROUP BY
- When to Use the COUNT() Function
5 Examples of Using COUNT() with GROUP BY
Want to Know More About COUNT() and GROUP BY?

Using the COUNT() function with GROUP BY is one of the most common SQL constructs in aggregate queries. Read this article to find out how to use COUNT() with GROUP BY correctly using 5 examples.

In this article, we will explain the importance of using COUNT with GROUP BY. We’ll talk about why it is essential in SQL and how it enables data analysis and summarization based on specific criteria. This combination empowers users to extract meaningful insights, calculate counts, and generate statistical summaries from large datasets.

If you’re looking for an in-depth review of basic SQL concepts like COUNT() and GROUP BY, I recommend our interactive SQL Basics course. It contains 129 exercises, which will help you review all key SQL concepts.

How to Use COUNT() and GROUP BY

If you’re here just for the quick answer, here’s the TLDR:

The correct way of using COUNT() with GROUP BY is shown in the query below:

SELECT 
  Store,
  COUNT(*) as NumberOfSales, 
FROM Sales
GROUP BY Store;

Let's break down the result to understand how this query works. The rows with the same value in the Store column are grouped together. Imagine an intermediate table where these rows are grouped and marked with different colors, like the image below. This would be our intermediary table containing only the Store column, since that is the column that is part of our SELECT statement.

Store	ProductID	CustomerID
Store A	165	7
Store A	111	16
Store A	145	25
Store B	136	9
Store B	111	38
Store C	136	16
Store C	111	8

The database then logically counts the number of rows in each group using the COUNT(*) function. This count represents the number of orders for each store.

Once the rows are counted, there is no need to have duplicate rows with the same Store value. So, GROUP BY will reduce the number of rows to only unique values. In other words, it will get rid of the individual rows and give us a summary of each group. We end up with the following result:

Store	NumberOfSales
Store A	3
Store B	2
Store C	2

When to Use GROUP BY

GROUP BY is an SQL clause that groups rows based on one or more column values. It is often used in combination with aggregate functions like COUNT(), SUM(), AVG(), MAX(), and MIN() to perform calculations on grouped data.

The GROUP BY clause is useful when you want to:

Make calculations and aggregations on subsets of data.
Generate summary statistics and metrics for different groups or categories.
Identify patterns and trends within specific groups.
Generate reports and analyze data based on different dimensions or attributes.
Apply filters and conditions on grouped data, using the HAVING clause

In summary, GROUP BY is used to organize and summarize data based on specific columns, functions or expressions, which will allow you to gain insights and perform calculations on distinct groups within a dataset.

When to Use the COUNT() Function

COUNT() is one of SQL’s most common aggregate functions. It returns the number of rows that match a specified condition or are included in a result set. It is often used to retrieve the total number of records in a table or to calculate the number of occurrences of a particular value within a column.

5 Examples of Using COUNT() with GROUP BY

Now that we’ve gone through the basic scenarios where COUNT() and GROUP BY are used, let’s go over some of the more complicated examples. We’ll start off with simpler examples and work our way up to more complex scenarios.

Example #1: GROUP BY a Single Column

The simplest scenario you could encounter is when you need to GROUP BY a single column. In the following example, we need to find out how our company’s employee count is distributed across different job titles.

Before seeing the SQL solution to this scenario, let’s go over the sample data. This is the employees table:

EmployeeID	FirstName	LastName	JobTitle
1	John	Doe	Manager
2	Jane	Smith	Supervisor
3	Mark	Johnson	Developer
4	Emily	Williams	Analyst
5	Michael	Brown	Designer
6	Sarah	Davis	Developer
7	Robert	Wilson	Designer
8	Jessica	Taylor	Developer

By using the COUNT function with GROUP BY on the JobTitle column, we can get a breakdown of the number of employees in each specific role. You can see the query and the output (based on the sample data) below:

SELECT 
  JobTitle, 
  COUNT(*) AS NumberOfEmployees
FROM employees
GROUP BY JobTitle;

JobTitle	NumberOfEmployees
Analyst	1
Designer	2
Developer	3
Manager	1
Supervisor	1

This example works the same way as our initial query. GROUP BY puts the rows for employees with the same job title into one group. Then the COUNT() function counts the rows in each group. GROUP BY then collapses the rows in each group, keeping only the value of the column JobTitle and the count.

Example #2: GROUP BY Multiple Columns

Of course, you can group rows by more than one column.

In this example, we will look at a sample orders table containing basic order information:

OrderID	CustomerID	ProductID	ProductCategory	OrderDate	Status	Amount
1	101	1001	Electronics	2023-05-01	Completed	150.00
2	102	1002	Clothing	2023-05-02	Completed	80.00
3	101	1001	Home Goods	2023-06-03	In progress	60.00
4	103	NULL	Accessories	2023-06-03	Canceled	200.00
5	101	1002	Electronics	2023-07-04	NULL	120.00
6	102	1001	NULL	2023-07-05	NULL	NULL
7	103	1002	Clothing	2023-07-06	In progress	90.00
8	102	1002	Accessories	2023-08-07	NULL	75.00
9	103	NULL	NULL	2023-08-08	NULL	100.00
10	101	1001	Home Goods	2023-09-09	NULL	NULL
11	102	1001	Home Goods	2023-06-05	In progress	80.00
12	103	1004	Accessories	2023-06-06	Completed	75.00
13	102	1005	Electronics	2023-08-06	Completed	88.00

We need to write a query that will show the number of orders placed by each customer and the ProductCategory of that order. This means that we will have to return the CustomerID and the category that the order falls within.

The query will look like this:

SELECT
  CustomerID,
  ProductCategory,
  COUNT(*) AS NumberOfOrders
FROM orders
GROUP BY CustomerID, ProductCategory;

And the result of running this query can be seen below:

CustomerID	ProductCategory	NumberOfOrders
101	Electronics	2
101	Home Goods	2
102	Electronics	2
102	Accessories	2
102	Clothing	1
103	Accessories	2
103	Clothing	1
103	NULL	1

Our query groups rows by two columns: CustomerID and ProductCategory. This means that GROUP BY groups rows with the same values of CustomerID and ProductCategory into one group. (Rows for CustomerID 101 and the category Electronics are in one group, but rows for CustomerID 101 and the category Home Goods are in a different group.) Then the rows in each group are counted by COUNT().

Example #3: Using WHERE with COUNT() and GROUP BY

Our previous example analyzed a scenario where we wanted to create an aggregate of all of the information in our orders table. But sometimes, we might want to slice into this information and only see orders for selected categories.

Using the same sample data that we had before, we will now write a query that shows the same information for orders that fall within the “Accessories” or “Clothing” ProductCategory.

To do this, we can use the query from Example 2 and just add a WHERE clause. This clause will filter for records where ProductCategory is equal to “Accessories” or “Clothing”.

SELECT
  CustomerID,
  ProductCategory,
  COUNT(*) AS NumberOfOrders
FROM orders
WHERE ProductCategory IN (‘Accessories’, ‘Clothing’)
GROUP BY CustomerID, ProductCategory;

CustomerID	ProductCategory	NumberOfOrders
102	Accessories	2
103	Accessories	1
102	Clothing	1
103	Clothing	1

If it’s not entirely intuitive how the database has generated the results using the query above, here’s a step-by-step explanation of what happened behind the scenes:

First, the database scans the orders table and reads all the rows.
It then applies the filtering condition in WHERE ProductCategory IN (‘Accessories’, ‘Clothing’ ) to filter the rows. After this step, only the rows where the product category is “Accessories” or “Clothing” are considered for further processing.
The filtered rows are then grouped based on the values in the CustomerID and ProductCategory columns, which are specified in the GROUP BY clause.
For each of the unique combinations of CustomerID and ProductCategory, the COUNT(*) function is applied. This will count the number of rows within each group.
The final result set includes the CustomerID, ProductCategory and the count of orders (shown in the NumberOfOrders column) for each group.

In summary, the database will filter the rows that respect the specified filtering condition. It will then group them according to the specified columns in the GROUP BY clause, and then calculate the count of orders within each of those groups. The final output will include the CustomerID, ProductCategory and the corresponding count of orders for each unique combination of CustomerID and ProductCategory.

Example #4: Using ORDER BY with COUNT() and GROUP BY

Looking at the result from example number 4, we can see that some of the rows in the output are mixed. This is because the columns in the GROUP BY list will generally require – but not guarantee – sorting the output based on the list of columns in the GROUP BY section.

But if, for example, we need to organize the results on different conditions ( e.g. a timeline or by date) we will need to use an ORDER BY clause:

SELECT
  CustomerID,
  ProductCategory,
  COUNT(*) AS NumberOfOrders
FROM orders
WHERE ProductCategory IN (‘Accessories’, ‘Electronics’)
GROUP BY CustomerID, ProductCategory
ORDER BY ProductCategory, CustomerID;

CustomerID	ProductCategory	NumberOfOrdes
101	Accessories	2
102	Accessories	2
102	Electronics	1
103	Electronics	2

We’ve added the ORDER BY clause with the list of columns we want to sort the data by. As you can see, this sorts the output information based on the order of the columns listed.

Example #5: COUNT(expression)

So far, we’ve looked at simple examples of working with COUNT(); the purpose was to count all of the rows in the source dataset or table.

However, there are more complex ways of using the COUNT–GROUP BY combination. To explain this, we’ll build a new sample dataset.

We have a table called SurveyResponses that stores answers to a survey. Some of the questions are optional, which is why there are some respondents with NULL answer values; they’ve skipped the optional questions. You can see the sample data below:

ResponseID	RespondentID	RespondentName	QuestionID	Answer
1	101	John	1	Agree
2	101	John	2	No
3	101	John	3	Yes
4	102	Sarah	1	Yes
5	102	Sarah	2	Not Sure
6	102	Sarah	3	NULL
7	103	Mark	1	No
8	103	Mark	2	Maybe
9	103	Mark	3	No
10	104	Emily	1	Yes
11	104	Emily	2	Not Sure
12	104	Emily	3	Disagree

COUNT(*) - Counting Rows

The most common usage of the COUNT function (and its default functionality, even if it’s used with GROUP BY) is to count the number of rows. For example, if we want to count the answer types of survey questions, we can use the following query:

SELECT COUNT(*) AS NumberOfYesAnswers, Answer
FROM SurveyResponses
GROUP BY Answer;

The output will look like this:

NumberOfAnswers	Answer
3	Yes
3	No
2	Not Sure
1	Agree
1	Disagree
1	Maybe
1	NULL

The outcome of this is a count of all rows that match a certain answer. It returns the aggregate number of survey answers and the type of the answer.

Counting NON-NULL Values in a Column

Let’s look at another example that might seem to generate the same results. Actually, this example has something particular and important about it; instead of using * as a parameter in our COUNT() function, we’re using COUNT() with a column name.

We will modify the query in the previous example. Instead of the * parameter for the COUNT() function, we will replace it with the Answer column. Our new query looks like this:

SELECT 
  COUNT(Answer) AS NumberOfAnswers, 
  AnswerFROM SurveyResponses
GROUP BY Answer;

If we run this new query, we will see that the results returned are almost exactly the same:

NumberOfAnswers	Answer
3	Yes
3	No
2	Not Sure
1	Agree
1	Disagree
1	Maybe
0	NULL

We can see that the output for the NULL answer value has changed from 1 to 0. This is because the COUNT() function only takes into consideration non-NULL values when doing the aggregation.

Before we were doing COUNT(*), which implicitly means count the rows; COUNT(Answer) will count the values in the Answer column. And because we had 1 value with NULL, it will skip those values in its calculation – returning 0 in this second scenario.

Counting Distinct NON-NULL Values in a Column

In this third variation of using the COUNT function, we will use the same query as in our previous example. This time, though, we will add the DISTINCT keyword before the column name.

SELECT 
  Answer, 
  COUNT(DISTINCT Answer) AS DistinctCount
FROM SurveyResponses
GROUP BY Answer;

Status	DistinctCount
Yes	1
No	1
Not Sure	1
Agree	1
Disagree	1
Maybe	1
NULL	0

We can see in the output above that the result of this query has turned all of the positive values in the DistinctCount column to 1. Adding the DISTINCT keyword means that each time the COUNT function finds a new value it hasn’t seen before, it will add that status to its list and add a 1 to its count. However, if it finds the same status value a second time or more, it will skip counting it.

Want to Know More About COUNT() and GROUP BY?

Using COUNT() with GROUP BY is only one of the many powerful features that SQL offers. If you’re eager to explore SQL further and truly master its capabilities, I encourage you to check out our comprehensive SQL Basics course. If you’re looking for SQL practice, I recommend our SQL Practice track, which has over 600 SQL practice exercises.

After taking our courses, maybe you feel like you want to find a job working with SQL. To help you prepare for your interview, we have a list of SQL interview questions that will show you some real-world examples of questions and problems in SQL.

Tags: