29th Oct 2020 Updated: 6th Nov 2024 6 minutes read What is COUNT(*), COUNT(1), COUNT(column), and COUNT(DISTINCT) in SQL? Tihomir Babic GROUP BY aggregate functions Table of Contents What Does the COUNT() Function do? COUNT(*) vs COUNT(1) COUNT(*) vs COUNT(column name) COUNT(column name) vs COUNT (DISTINCT column_name) Where to Find More Practice Have you noticed there are different variations of the SQL COUNT() function? This article explains the various arguments and their uses. As a SQL user, you’re probably quite familiar with the COUNT() function. Even though it’s relatively simple, it can be used in several different ways. Each way has a very different use. I imagine you’ve seen code that contains the function COUNT(*) or COUNT(1). You’ve probably also seen some other uses of the COUNT() function, such as COUNT(column name) and COUNT(DISTINCT column name), even if you haven’t used them. You’re probably wondering what each variation of COUNT() does. Let’s find out! The best way to practice SQL is our interactive SQL Practice Set course. It contains over 90 hands-on SQL exercises to review and refresh the most important SQL topics, including COUNT() and GROUP BY. With each exercise solved you build confidence in your SQL skills. What Does the COUNT() Function do? The COUNT() function counts. But what does it count? The COUNT() function belongs to SQL’s aggregate functions. It counts the number of rows that satisfy the criteria defined in the query. It does not return the rows themselves; it shows the number of rows that meet your criteria. There are different things that can be counted. That’s why there are different variations of the COUNT() function. In this article, I’ll concentrate on four: COUNT(*) COUNT(1) COUNT(column name) COUNT(DISTINCT column name) COUNT(*) vs COUNT(1) You may have seen various discussions about the differences between COUNT(*) and COUNT(1). And maybe trying to find the answer confused you even more. So, is there any difference? The simple answer is no – there is no difference at all. The COUNT(*) function counts the total rows in the table, including the NULL values. The semantics for COUNT(1) differ slightly; we’ll discuss them later. However, the results for COUNT(*) and COUNT(1) are identical. Let’s test this claim using an example query. Suppose I have a table named orders that contains these columns: order_id: The ID of the order. customer_id: The ID of the customer who placed the order. order_value: The total value of the ordered items, in euros. payment_date: When the order was paid by the customer. Let's compare the use of COUNT(*) and COUNT(1) in one query: SELECT COUNT(*) AS count_asterisk, COUNT(1) AS count_1 FROM orders; Here's the result: count_asterisk count_1 8 8 Both versions return the same number of rows. There’s a popular misconception that “1” in COUNT(1) means “count the values in the first column and return the number of rows.” From that misconception follows a second: that COUNT(1) is faster because it will count only the first column, while COUNT(*) will use the whole table to get to the same result. This is not true. So what does the value in the parenthesis of COUNT() mean? It’s the value that the COUNT() function will assign to every row in the table. Of course, it will be assigned a number of times that’s equal to the number of rows in the table. In other words, COUNT(1) assigns the value from the parentheses (number 1, in this case) to every row in the table, then the same function counts how many times the value in the parenthesis (1, in our case) has been assigned; naturally, this will always be equal to the number of rows in the table. The parentheses can contain any value; the only thing that won’t work will be leaving the parentheses empty. Since it doesn’t matter which value you put in the parentheses, it follows that COUNT(*) and COUNT(1) are precisely the same. They are precisely the same because the value in the COUNT() parentheses serves only to tell the query what it will count. There’s no difference in the performance. Don’t let the asterisk (*) make you think it has the same use as in SELECT * statement. No, COUNT(*) will not go through the whole table before returning the number of rows, making itself slower than COUNT(1). So, in the end, who wins in this dramatic COUNT(*) vs COUNT(1) battle? Nobody – it’s a draw; they’re exactly the same. However, I’d recommend using COUNT(*), as it’s much more commonly seen. It’s also less confusing, naturally leading other SQL users to understand that the function will count all the numbers in the table, including the NULL values. COUNT(*) vs COUNT(column name) How about this one, COUNT(*) vs COUNT(column name). Is there any difference? There sure is! As you’ve already learned, COUNT(*) will count all the rows in the table, including NULL values. On the other hand, COUNT(column name) will count all the rows in the specified column while excluding NULL values. As you already know, there are eight rows in the table orders. Let’s see how many rows there will be when I use the column order_id for counting (imagining I want to see how many orders have been placed). We’ll get eight rows again, right? Let’s see: SELECT COUNT(order_id) AS number_of_orders FROM orders; Do we get the same result? Nope, there are seven orders, not eight. number_of_orders 7 Is that a mistake? No, it’s not; there are really only seven orders with an order_id; one row has a NULL instead of a proper order_id. Below is the row that makes the difference: order_idcustomer_idorder_priceorder_date NULLCU0921327.85NULL Always remember: COUNT(column name) will only count rows where the given column is NOT NULL. The principles of combining GROUP BY and COUNT() are outlined in this article about GROUP BY and SQL aggregate functions. If you want some more practice, here are five examples of GROUP BY. COUNT(column name) vs COUNT (DISTINCT column_name) You can probably imagine what the difference between those two COUNT() function versions is. COUNT(column_name) will include duplicate values when counting. In contrast, COUNT (DISTINCT column_name) will count only distinct (unique) rows in the defined column. If you want to count the number of customers who’ve placed an order, maybe COUNT (column_name) will work. Let’s try this simple code: SELECT COUNT (customer_id) AS number_of_customers FROM orders; You’re familiar with this one; I’ve already used the COUNT(column name) function. This time it counts all rows in the column customer_id, with the result being shown in the column number_of_customers. Here’s the result: number_of_customers 8 Let’s check the result by looking at the entire orders table: order_idcustomer_idorder_priceorder_date OR2020-01CU108154872020-01-08 OR2020-28CU149154872020-01-14 OR2020-12CU10812549.222020-01-09 OR2020-91CU012542.55NULL NULLCU0921327.85NULL OR2020-112CU049150002020-02-28 OR2020-213CU0521502020-03-12 OR2020-213CU0522002020-03-12 There are eight rows, but is this really the number of the customers? Notice that the customers CU108 and CU052 appear twice. If I want the real number of customers, then I need to count every customer only once. How can I do this? By using COUNT(DISTINCT customer_id): SELECT COUNT(DISTINCT customer_id) AS number_of_customers FROM orders; This query will also count rows in the column customer_id, but it will count every customer only once. This is due to the keyword DISTINCT. Have a look at the result: number_of_customers 6 This is the correct result; there are really only six unique customers. Where to Find More Practice Now that you understand several common variations of the COUNT() function, you can create more complex calculations and reports. COUNT() is one of the most used aggregate functions, so it’s vital that you clearly understand the different COUNT() variations and their purposes. For more practice using COUNT(), try our interactive Creating Basic SQL Reports course that's focused on advanced use of GROUP BY and aggregate functions. Tags: GROUP BY aggregate functions