Basic GROUP BY Usage
Published: Jul 29, 2013
This is a key phrase when referring to any aggregate function as all aggregate functions require grouping in order for the data to have any meaning. As such this is a mandatory key phrase to be used within all aggregation.

As a simple example to demonstrate this we will consider a basic table which shows the recent outcomes of several rugby matches:

if object_id('tempDB..#results') is not null drop table #results
go

create table #results
(
  
teamName varchar(5),
  
isWin tinyint,
  
isLose tinyint,
  
isDraw tinyint,
  
points as ((isWin * 3) + isDraw)
)
insert into #results select 'Team1', 1, 0, 0
insert into #results select 'Team1', 1, 0, 0
insert into #results select 'Team1', 0, 0, 1
insert into #results select 'Team2', 0, 1, 0
insert into #results select 'Team2', 1, 0, 0
insert into #results select 'Team2', 0, 1, 0
insert into #results select 'Team3', 0, 1, 0
insert into #results select 'Team3', 0, 1, 0
insert into #results select 'Team3', 0, 0, 1
insert into #results select 'Team4', 0, 0, 1
insert into #results select 'Team4', 1, 0, 0
insert into #results select 'Team4', 0, 0, 1


Now, let’s say we wanted to know how many wins, losses, and draws each team had. Well to do this we would simply need to sum each column relevant to each team. Translated into SQL this means we want to sum up each column, grouped by team. Hence we would use the following:

select teamName, SUM(isWin) noWins, SUM(isLose) noLosses,
      
SUM(isDraw) noDraws, SUM(points) noPoints
from #results
group by teamName


As you can see, the group by clause is at the end of the query in this case. There are only 2 circumstances in which the group by clause is not at the end of the query and they are when you are using ORDER BY or when you are using HAVING.

Order by is pretty self explanatory. Here we will order our results by the total number of points (highest to lowest… “descending order”):

select teamName, SUM(isWin) noWins, SUM(isLose) noLosses,
      
SUM(isDraw) noDraws, SUM(points) noPoints
from #results
group by teamName
order by noPoints desc


Having is a little more advanced and is used almost like a where clause, but on the aggregate functions. Basically a where clause acts on the initial table… so if you wanted just team1 you would add “where teamName = ‘team1’” as this constraint works on the table itself. However, if you want to have, for example, only teams who have scored more than 4 points then a where clause will not work as this value is not present in the table, it is only present as an aggregate. Therefore to accommodate this we have the “having” clause which is actioned after the group by has taken place and therefore is written after the group by clause for clarity although, as with all keywords, it comes before the order by and must refer to the aggregate, not any given alias (in this case it must refer to “sum(points)” and not to the alias “noPoints”)…

select teamName, SUM(isWin) noWins, SUM(isLose) noLosses,
      
SUM(isDraw) noDraws, SUM(points) noPoints
from #results
group by teamName
having SUM(points) > 4
order by noPoints desc


Leave a Comment
Your email address will not be published. All fields are mandatory.
NB: Comments will only appear once they have been moderated.

Your SQL Trainer
Kevin Urquhart

I am a SQL Server DBA, Architect, Developer, and Trainer for SQL Training. This is my blog in which I’m simply trying to share my SQL knowledge and experiences with the world.

Kevin Urquhart
SQL Server Consultant, London

Categories


Archives


Copyright © 2018 London SQL Server Training Courses @ SQL Training | Website Design From Scott Heron