Calculating Cumulative Totals
Published: Apr 01, 2016
Following on from my last post really, this was another aspect of the MI I was helping out with which benefitted from the use of some more modern code (SQL 2012 or above) in order to make code both clearer and more concise. It is also a lot faster and less strain on the server using the newer method.

This involves using Window functions. These are basically functions acting on a set of data rather than row by row. They’ve been around for a while in most other RDBMS systems, but only in SQL Server as of 2012. Probably why I’ve not seen them used very much as a good few companies I see are still on SQL 2008 R2.

Anyway, they’re very powerful functions which can remove the need for otherwise complex code. The example I’m specifically highlighting here is the ability to quickly and easily calculate a running total.

Here I’m just using a small dataset from the AdventureWorks2012 database to show you what I mean.

This is one traditional way to calculate such a value:

declare @interim table
(
  
salesOrderID int,
  
fullName varchar(100),
  
totalDue money
)
insert into @interim
select SalesOrderID, p.firstName + ' ' + p.LastName as fullName, TotalDue
from sales.SalesOrderHeader s
join sales.vSalesPerson p
on s.SalesPersonID = p.BusinessEntityID
where orderDate >= '2005-07-01'
and orderDate < '2005-10-01'
and salesPersonID = 279

select salesOrderID, fullName, totalDue,
   (
      
select sum(totalDue) cumulative
      
from @interim
      
where salesOrderID <= i.salesOrderID
  
)
from @interim i


This gives us the running totals that we are looking for:

So how can we do this using the newer Window Functions?

select SalesOrderID, p.firstName + ' ' + p.LastName as fullName, TotalDue,
      
sum(totalDue) over(order by salesOrderID rows between unbounded preceding and current row) as cumulativeSales
from sales.SalesOrderHeader s
join sales.vSalesPerson p
on s.SalesPersonID = p.BusinessEntityID
where orderDate >= '2005-07-01'
and orderDate < '2005-10-01'
and salesPersonID = 279
order by salesOrderID


It’s much more concise and clearer to read.

Basically this tells us that we are summing the values, ordered by SalesOrderID, between a range… that range being “unbounded preceding and current row” which means every SalesOrderID prior to the current row inclusive.

This is not only neater, but is also much faster.

If we remove the WHERE clause from both statements (meaning that we will calculate cumulative values across the entire Sales.SalesOrderHeader table) then we can see the following in Profiler:

As you can clearly see, the older style of code was a factor of 10 slower than using a Window Function and with considerably more reads and CPU cycles as well.

It’s also worth noting that this was only over a table of around 4000 records. Running this over a production size table and the differences get exponentially larger.

Therefore this is definitely something to consider when you’re next asked to produce a cumulative calculation.
Leave a Comment
Your email address will not be published. All fields are mandatory.
NB: Comments will only appear once they have been moderated.

Your SQL Trainer
Kevin Urquhart

I am a SQL Server DBA, Architect, Developer, and Trainer for SQL Training. This is my blog in which I’m simply trying to share my SQL knowledge and experiences with the world.

Kevin Urquhart
SQL Server Consultant, London

Categories


Archives


Copyright © 2018 London SQL Server Training Courses @ SQL Training | Website Design From Scott Heron