In this article, we will look into the Cumulative Distribution function of SQL Server 2012 (Denali).
Introduction
Sql Server 2012 has brought a lot of new functions for the T-Sql developers. In this article we will look into Cumulative_Distribution Function.
Purpose
This function calculates the cumulative distribution of a value in a group of values within a group of rows.
Syntax
Cume_Dist() Over ([partition_by_clause] order_by_clause)
Logic behind Cumulative_Distribution
Well this function can be defined as under
Cumulative_Distribution = (R)/(N)
Where,
R => The row whose rank needs to be found
N => Total number of rows or records.
A case study
Let us first create the environment
IF EXISTS (SELECT * FROM sys.objects WHERE name = N'PlayerTable' AND type = 'U')
DROP TABLE PlayerTable
GO
SET ANSI_NULLS ON
GO
--Create the table
CREATE TABLE [dbo].[PlayerTable](
PlayerID INT IDENTITY(1001,1),
PlayerName VARCHAR(30),
BelongsTo VARCHAR(15),
Amount NUMERIC(16,2)
) ON [PRIMARY]
GO
--Insert records
INSERT INTO PlayerTable(PlayerName, BelongsTo, Amount) VALUES('Sampark Karmakar','India', 1000)
INSERT INTO PlayerTable(PlayerName, BelongsTo, Amount) VALUES('Hemanth Madan','India', 4000)
INSERT INTO PlayerTable(PlayerName, BelongsTo, Amount) VALUES('P.H. Pradhan','India',3600)
INSERT INTO PlayerTable(PlayerName, BelongsTo, Amount) VALUES('Dabasis Halder','India',5000)
INSERT INTO PlayerTable(PlayerName, BelongsTo, Amount) VALUES('Nuel Kuya','India', 3600)
INSERT INTO PlayerTable(PlayerName, BelongsTo, Amount) VALUES('Manjunath P','Afghanisthan',2000)
INSERT INTO PlayerTable(PlayerName, BelongsTo, Amount) VALUES('Hamim Mirza','Afghanisthan', 400)
INSERT INTO PlayerTable(PlayerName, BelongsTo, Amount) VALUES('Loly Gazia','Afghanisthan',8000)
INSERT INTO PlayerTable(PlayerName, BelongsTo, Amount) VALUES('Suresh Summer','Afghanisthan',1230)
INSERT INTO PlayerTable(PlayerName, BelongsTo, Amount) VALUES('J.June','Afghanisthan', 4988)
-- Project the records
Select * from PlayerTable
Result

Now let us run the below query
Select
SerialNumber = ROW_NUMBER() OVER(Order By GETDATE())
,PlayerID
,PlayerName
,BelongsTo
,Amount
,Cume_Dist() Over(Partition By BelongsTo Order By BelongsTo,PlayerID,Amount) As CumilativeDistribution
From PlayerTable
Result

Now let us understand the logic behind this. Take the example of the 4th Row where the PlayerId = 1009. There are a total of 5 rows for the [Belongs to] field “Afghanisthan”.
So , we have R = 4 (4th row) and N = 5 (Total number of records for "Afghanisthan").
Placing the values in our formula, we get
Percent_Rank = (4)/(5) = 3/4 = 0.8
How to implement the same in SQL SERVER 2005/2008?
Here is an attempt to simulate Cumulative Distribution function in SQL Server 2005/2008
;With Count_CTE AS
(
Select BelongsTo ,N = COUNT(BelongsTo)
From PlayerTable
Group By BelongsTo
)
,Rank_CTE AS
(
Select
SerialNumber = ROW_NUMBER() OVER(Order By GETDATE())
,PlayerID
,PlayerName
,BelongsTo
,Amount
,R = Rank() Over(Partition By BelongsTo Order by BelongsTo,PlayerID,Amount)
-- OR
-- R = ROW_NUMBER() Over(Partition By BelongsTo Order by BelongsTo,PlayerID,Amount)
From PlayerTable
)
Select r.*,c.N, CumilativeDistribution = CAST((r.R) AS DECIMAL(10,2)) / CAST((c.N ) AS DECIMAL(10,2))
From Rank_CTE r
Join Count_CTE c On r.BelongsTo = c.BelongsTo
Result

We are using the Rank function here that returns the rank of each row within the partition of a result set.In this case it can also be done using Row_Number function too.
References
Cumulative distribution function
Conclusion
SQL Server 2012 (Denali) seems to be very mature and promising and has embeeded with many new functions.In this article we have looked into the Cumulative Distribution function, it's internal formula and how we can do the same in lower version.Hope the article will be useful.
Thanks for reading