In this article we will learn Percentile_Disc Function of SQL Server 2012 (Denali)
Introduction
Sql Server 2012 has brought a lot of new functions for the T-Sql developers. In this article we will look into Percentile_Disc Function.
Purpose
An inverse distribution function which takes a percentile value and a sort specification and returns an element from the set.It accepts any numeric data-type or any non-numeric data-type that can be implicitly converted to a numeric data-type as it's argument and returns the same data-type as the numeric data-type of the argument.
Syntax
Percentile_Disc ( numeric_literal ) within group(Order by sort_expression [asc|desc]) OVER ([partition_by_clause])
Logic behind Percentile_Disc
Well this function can be defined as under
Percentile_Disc = Cumulative_Distribution * N
Where,
N = Total number of rows or records.
e.g. We have a set with 10 elements. We need to find the Percentile_Disc of the 7th row.Let us first find the Cumulative_distribution as
So, Cumulative_Distribution = (7)/(10) = 0.7
So, Percentile_Disc = Cumulative_Distribution * N = 0.7 * 10 = 7
The range of values return by this function is between 0 to 1 inclusive. The return data type is always a positive number.
A case study
Let us first create the environment
IF EXISTS (SELECT * FROM sys.objects WHERE name = N'MatchTable' AND type = 'U')
DROP TABLE MatchTable
GO
SET ANSI_NULLS ON
GO
--Create the table
CREATE TABLE [dbo].[MatchTable](
[MatchID] [int] IDENTITY(1,1) NOT NULL,
[MatchGroup] [varchar](8) NULL,
[MatchBetween] [varchar](50) NULL,
[ScheduleDate] [date] NULL
) ON [PRIMARY]
GO
--Insert records
Insert Into MatchTable Values
('Group-A','India VS Australia','08/14/2011')
,('Group-A','India VS Pakistan','08/15/2011')
,('Group-A','India VS Newzealand','08/16/2011')
,('Group-A','Australia VS Pakistan','08/17/2011')
,('Group-A','Australia VS Newzealand','08/18/2011')
,('Group-A','Newzealand VS Pakistan','08/19/2011')
,('Group-B','USA VS WestIndies','08/20/2011')
,('Group-B','USA VS Ireland','08/21/2011')
,('Group-B','USA VS Bangaladesh','08/22/2011')
,('Group-B','WestIndies VS Ireland','08/23/2011')
,('Group-B','WestIndies VS Bangaladesh','08/24/2011')
,('Group-B','Ireland VS Bangaladesh','08/25/2011')
-- Project the records
Select * From MatchTable

Now let us run the below query
SELECT [MatchID]
,[MatchGroup]
,[MatchBetween]
,[ScheduleDate]
,PercentileDisc =PERCENTILE_DISC(.6) WITHIN GROUP (ORDER BY MatchID)OVER(PARTITION BY [MatchGroup])
,Cume_Dist() Over(Partition By [MatchGroup] Order By MatchID) As CumilativeDistribution
,PercentileDiscByFormula = Cume_Dist() Over(Partition By [MatchGroup] Order By MatchID) * (6)
FROM MatchTable

Now let us analyze the 4th Record i.e. Match Id = 4. We have a total of 6 records in Group-A hance N = 6. And the cumulative distribution was 0.666666666666667.
Now if we put these values into the formula, we get
Percentile_Disc = 0.666666666666667 * 6 = 4.000000000000002 ~ = 4
How to implement the same in SQL SERVER 2005/2008?
Here is an attempt to simulate Percentile_Disc function in SQL Server 2005/2008
;With Count_CTE AS
(
Select [MatchGroup] ,N = COUNT([MatchGroup])
From MatchTable
Group By [MatchGroup]
)
,Rank_CTE AS
(
Select
[MatchID]
,[MatchGroup]
,[MatchBetween]
,[ScheduleDate]
,R = Rank() Over(Partition By [MatchGroup] Order by MatchID)
From MatchTable
)
Select r.[MatchID]
,r.[MatchGroup]
,r.[MatchBetween]
,r.[ScheduleDate]
,CumilativeDistribution = CAST((r.R) AS DECIMAL(10,2)) / CAST((c.N ) AS DECIMAL(10,2))
,TotalRecord = c.N
,PercentileDisc = Round((CAST((r.R) AS DECIMAL(10,2)) / CAST((c.N ) AS DECIMAL(10,2))) * c.N,0)
From Rank_CTE r
Join Count_CTE c On r.[MatchGroup] = c.[MatchGroup]
Result

First we are getting the Cumulative Distribution and Total Record Count by Match and then putting that into the formula, we are getting the result
You may need to refer to my Cumulative Distribution article for getting an understanding as what it is.
Conclusion
SQL Server 2012 (Denali) seems to be very mature and promising and has embedded with many new functions.In this article we have looked into the Percentile_Disc function, it's internal formula and how we can do the same in lower version.Hope the article will be useful.