10 reasons for which we should learn and use Row_Number Function of Sql Server 2005.

Niladri.Biswas
Posted by in Sql Server category on for Beginner level | Points: 250 | Views : 9573 red flag
Rating: 4.33 out of 5  
 3 vote(s)

In this tutorial, we will see as how to use Row_Number function with lots of practicals.

Table of Content

  1. Introduction
  2. Objective
  3. Daily Row_Number usage
    1. Generate sequential numbers on the fly
    2. Helps in Pagination
    3. Helps to find duplicates
    4. Helps to find out rows from a selected range
    5. Helps to generate number table on the fly with recursive CTE
    6. An alternate for TOP clause
    7. An alternate for Distinct clause
    8. Helps to sort record and insert into view from the source
    9. Helps to find appropriate record when use in conjunction with Aggregate function treated as Analytical function
    10. Helps to find the last inserted row
  4. References
  5. Conclusion

Introduction

Though too late in writing an article on this topic, still I feel that it's worth to have so.Because in our day to day activities, we use this function so much that it indeed needs some special attention and in this article we will look into as how in our regular programing area the Row_Number function, which got introduce in Sql Server 2005 has make a permanent place in helping the developers to write queries in an easy and comfortable way that seemed to be a little harder when we had to simulate the same in lower versions.

Objective

In this tutorial we will look into the day to day usages (at least the one which I encounter so far....It can be even more so suggestions are always welcome) of Row_Number function with examples.This article will not measure the performance of the two approaches but will show at least one example of doing the same in Sql Server 2000 and how easy it is to do the same using Row_Number function of Sql Server 2005.

Daily Row_Number usage

In the following paras, we will look into the facts as where Row_Number finds it's usage

1. Generate sequential numbers on the fly

Suppose we have a table as under

Declare @t Table(Name Varchar(20),Age int)
Insert Into @t 
Select 'Name1', 20 Union All Select 'Name2', 21 Union All 
Select 'Name3', 22 Union All Select 'Name4', 23 Union All
Select 'Name5', 24

Select *
From @t

/* Result */
Name	Age
Name1	20
Name2	21
Name3	22
Name4	23
Name5	24

Objective: To generate sequential numbers on the fly

Sql Server 2000 approach:

Case 1: Using Co-Related Subquery and an Aggregate Function

Select 
(
	Select Count(*) 
	From @t t2
    Where t2.Name <= t1.Name 
    And t2.Age <= t1.Age
 ) As Rn
 ,t1.Name
 ,t1.Age
From @t t1

OR

Select 
(
	Select Max(1)
	From @t t2
    Where t2.Name <= t1.Name 
    And t2.Age <= t1.Age
 ) As Rn
 ,t1.Name
 ,t1.Age
From @t t1

OR

Select 
(
	Select Sum(1)
	From @t t2
    Where t2.Name <= t1.Name 
    And t2.Age <= t1.Age
 ) As Rn
 ,t1.Name
 ,t1.Age
From @t t1

/* Result */
Rn	Name	Age
1	Name1	20
2	Name2	21
3	Name3	22
4	Name4	23
5	Name5	24

Case 2: Using identity column

Declare @t Table(Rn Int Identity(1,1), Name Varchar(20),Age int)
Insert Into @t 
Select 'Name1', 20 Union All Select 'Name2', 21 Union All 
Select 'Name3', 22 Union All Select 'Name4', 23 Union All
Select 'Name5', 24

Select *
From @t

/* Result */
Rn	Name	Age
1	Name1	20
2	Name2	21
3	Name3	22
4	Name4	23
5	Name5	24

Case 3: Using inner join

Select
	Rn = COUNT(*) 
	,t1.Name
	,t1.Age
From @t t1
Join @t t2 On  t1.Name >= t2.Name
Group By t1.Name,t1.Age
Order By  t1.Name,t1.Age

/* Result */
Rn	Name	Age
1	Name1	20
2	Name2	21
3	Name3	22
4	Name4	23
5	Name5	24

Sql Server 2005 Row_Number approach

Select Rn =Row_Number() Over(Order by (Select 1)),t.*
From @t t

/* Result */
Rn	Name	Age
1	Name1	20
2	Name2	21
3	Name3	22
4	Name4	23
5	Name5	24

2. Helps in Pagination

Paging is a very common implementation in most of the applications for displaying the records. Now this can be done either at the client side application or at the server side application. But doing so in the client side will increase the load on the client application as fetching the whole records set and keeping them into the memory, then choosing the records within the range will give a serious performance impact. On the other hand, if it can be done at the database side, then the client application will get only those records in which they will be interested in at that point of time and hence the client application’s performance will boost.

Let us first set up the environment for performing the demonstration.We will populate 100 records to a tblPagingExperiment table

--Drop the table tblPagingExperiment if it exists
IF OBJECT_ID('tblPagingExperiment','U') IS NOT NULL BEGIN
	DROP TABLE tblPagingExperiment
END
GO

-- Create the table
Create Table tblPagingExperiment (
	[Person ID]	Int Identity
	,[Person Name] Varchar(100)
	,Age Int
	,DOB Datetime
	,Address Varchar(100) 
)
GO

-- Populate 100 data to the table
Insert into tblPagingExperiment  
Select 
		'Person Name' + CAST(Number AS VARCHAR)
		, Number
		,DATEADD(D,Number, '1900-01-01')
		,'Address' + CAST(Number AS VARCHAR)
From master..spt_values 
Where Type = 'p' 
And Number Between 1 and 100

-- Project the records
Select * 
From tblPagingExperiment

Objective: To skip first 10 rows and display next 20 rows.i.e. from row number 11 to 30

Paging using Sql Server 2000 approach

-- Variable to hold the offset value
Declare @RowSkip As int
-- Variable to hold the fetch value
Declare @RowFetch As int

--Set the value of rows to skip
 Set @RowSkip = 10
--Set the value of rows to fetch
 Set @RowFetch = 20

--If the #Temp object exists in the tempdb, then drop it
IF OBJECT_ID('tempdb..#Temp') IS NOT NULL
BEGIN
Drop Table #Temp 
END

--Create a temporary table
Create Table #Temp
(
	Rn int Identity
	,[Person ID] int
	,[Person Name] Varchar(50)
	,Age int
	,DOB datetime
	,Address Varchar(100) 
)

-- Insert the records into the Temporary table
Insert Into #Temp([Person ID],[Person Name],Age,DOB,Address)
Select  [Person ID],[Person Name],Age,DOB,Address
From tblPagingExperiment

-- Fetch the records from 11 to 30
Select 
	[Person ID]
	,[Person Name]
	,Age
	,DOB
	,Address  
From #Temp
Where Rn Between  (@RowSkip+1) And  (@RowSkip+ @RowFetch) 

Paging using Sql Server 2005 Row_Number approach

-- Variable to hold the offset value
Declare @RowSkip As int
-- Variable to hold the fetch value
Declare @RowFetch As int

--Set the value of rows to skip
 Set @RowSkip = 10
--Set the value of rows to fetch
 Set @RowFetch = 20

;With Cte As
 (
	Select 
		rn=ROW_NUMBER() 
		Over(Order by (Select 1) /* generating some dummy column*/ )
		,*
	From tblPagingExperiment
 )
-- Fetch the records from 11 to 30
Select 
	[Person ID]
	,[Person Name]
	,Age
	,DOB
	,Address  
From Cte
Where rn Between  (@RowSkip+1)And  (@RowSkip+ @RowFetch) 

/* Result */
Person ID	Person Name	Age	DOB	Address
11	Person Name11	11	1900-01-12 00:00:00.000	Address11
12	Person Name12	12	1900-01-13 00:00:00.000	Address12
13	Person Name13	13	1900-01-14 00:00:00.000	Address13
14	Person Name14	14	1900-01-15 00:00:00.000	Address14
15	Person Name15	15	1900-01-16 00:00:00.000	Address15
16	Person Name16	16	1900-01-17 00:00:00.000	Address16
17	Person Name17	17	1900-01-18 00:00:00.000	Address17
18	Person Name18	18	1900-01-19 00:00:00.000	Address18
19	Person Name19	19	1900-01-20 00:00:00.000	Address19
20	Person Name20	20	1900-01-21 00:00:00.000	Address20
21	Person Name21	21	1900-01-22 00:00:00.000	Address21
22	Person Name22	22	1900-01-23 00:00:00.000	Address22
23	Person Name23	23	1900-01-24 00:00:00.000	Address23
24	Person Name24	24	1900-01-25 00:00:00.000	Address24
25	Person Name25	25	1900-01-26 00:00:00.000	Address25
26	Person Name26	26	1900-01-27 00:00:00.000	Address26
27	Person Name27	27	1900-01-28 00:00:00.000	Address27
28	Person Name28	28	1900-01-29 00:00:00.000	Address28
29	Person Name29	29	1900-01-30 00:00:00.000	Address29
30	Person Name30	30	1900-01-31 00:00:00.000	Address30

3. Helps to find duplicates

Consider the below table

Declare @t Table(name varchar(20), email varchar(20))
insert into @t values ('name1', '1stEmail@email.com')
insert into @t values ('name1', '2ndEmail@email.com')
insert into @t values ('name1', '3rdEmail@email.com')
insert into @t values ('name2', '1stEmail@email.com')
insert into @t values ('name3', '1stEmail@email.com')
insert into @t values ('name3', '2ndEmail@email.com')
insert into @t values ('name4', '1stEmail@email.com')

Select *
From @t

/* Result */
name	email
name1	1stEmail@email.com
name1	2ndEmail@email.com
name1	3rdEmail@email.com
name2	1stEmail@email.com
name3	1stEmail@email.com
name3	2ndEmail@email.com
name4	1stEmail@email.com

Sql server 2000 approach to find the duplicate email entries

SELECT name,  EmailCount = count(email)
FROM @t
GROUP BY name
HAVING count(*) > 1

/* Result */
name	EmailCount
name1	3
name3	2

Sql server 2005 Row_Number approach to find the duplicate email entries i.e. Persons having multiple emails

Select distinct name,count(email)+1
From
	(
		Select 
			Rn = Row_Number() Over(Partition By name Order By (Select 1))
			,name,email
		From @t
	)x
Where Rn > 1
Group By name
/* Result */
name	EmailCount
name1	3
name3	2

We are using the partition by clause to perform the partition which will yield the below

Rn	name	email
1	name1	1stEmail@email.com
2	name1	2ndEmail@email.com
3	name1	3rdEmail@email.com
1	name2	1stEmail@email.com
1	name3	1stEmail@email.com
2	name3	2ndEmail@email.com
1	name4	1stEmail@email.com

From the result generated we can make out that, Rn with value > 1 has multiple entries.

4. Helps to find out rows from a selected range

Suppose we have a table as under

Declare @t Table(Name Varchar(20),Age int)
Insert Into @t 
Select 'Name1', 20 Union All Select 'Name2', 21 Union All 
Select 'Name3', 22 Union All Select 'Name4', 23 Union All
Select 'Name5', 24

Select *
From @t

/* Result */
Name	Age
Name1	20
Name2	21
Name3	22
Name4	23
Name5	24

Objective: To select the third and fourth row

Sql Server 2000 approach:

Select Name,Age 
From(
		Select 
		(
			Select Count(*) 
			From @t t2
			Where t2.Name <= t1.Name 
			And t2.Age <= t1.Age
		 ) As Rn
		 ,t1.Name
		 ,t1.Age
		From @t t1
	)x
Where x.Rn Between 3 And 4

/* Result */
Name	Age
Name3	22
Name4	23

Sql Server 2005 Row_Number approach:

Select Name,Age 
From(		
	  Select Rn = Row_Number() Over(Order By (select 1)),*
	  From @t
	)x
Where x.Rn Between 3 And 4

First we are generating the sequence in the inner query and then filtering them out in the outer query

5. Helps to generate number table on the fly with recursive CTE

Sometimes we need to generate a Number table or Tally table. The importance of that can be found here

In Sql Server 2000, we use to do so by using While loop

--If the #Temp object exists in the tempdb, then drop it
IF OBJECT_ID('tempdb..#Temp') IS NOT NULL
BEGIN
Drop Table #Temp 
END

--Create a temporary table
Create Table #Temp
(
	Number int 
)

Declare @i int,@maxLimit int
Set @i = 1
Set @maxLimit = 100

-- Polutae the table using while loop
While( @i <= @maxLimit)
Begin
	Insert Into #Temp(Number) Values(@i)
	Set @i = @i + 1
End

Select *
From #Temp

Sql Server 2005 Row_Number approach with Recursive CTE to generate Tally/Number table:

;With NumTable As
(
	Select Rn = Row_Number() Over ( Order By (Select 1))
	Union All
	Select Rn + 1
	From NumTable Where Rn < 100
)
Select *
From NumTable

6. An alternate for TOP clause

To get the Top N records, we use the TOP fucntion for achieving so till Sql Server 2000

Declare @t Table(Name Varchar(20),Age int)
Insert Into @t 
Select 'Name1', 20 Union All Select 'Name2', 21 Union All 
Select 'Name3', 22 Union All Select 'Name4', 23 Union All
Select 'Name5', 24

Select Top 2 *
From @t

/* Result */
Name	Age
Name1	20
Name2	21

Fetches the Top 2 records

We can use Row_Numnber as an alternative way to of TOP

Select Name,Age 
From(		
	  Select Rn = Row_Number() Over(Order By (select 1)),*
	  From @t
	)x
Where x.Rn <= 2

/* Result */
Name	Age
Name1	20
Name2	21

7. An alternate for Distinct clause

Consider the below table

Declare @t Table(Name Varchar(20),Age int)
Insert Into @t 
Select 'Name1', 20 Union All Select 'Name1', 20 Union All 
Select 'Name2', 22 Union All Select 'Name2', 22 Union All
Select 'Name3', 22 Union All Select 'Name3', 23

Select *
From @t

/* Result */
Name	Age
Name1	20
Name1	20
Name2	22
Name2	22
Name3	22
Name3	23

Sql server 2000 approach to get distinct records

Select 
	Distinct Name,Age 
From @t

/* Result */
Name	Age
Name1	20
Name1	21
Name2	22
Name3	22
Name3	23

Sql server 2005 Row_Number approach to simulate the Distinct clause

Select Name,Age
From
(Select 
	Rn = Row_Number() Over(Partition By Name,Age Order By (Select 1))
	,Name,Age
From @t)x
Where Rn = 1
/* Result */
Name	Age
Name1	20
Name1	21
Name2	22
Name3	22
Name3	23

We are using the partition by clause to perform the partition which will yield the below

Rn	Name	Age
1	Name1	20
1	Name1	21
1	Name2	22
2	Name2	22
1	Name3	22
1	Name3	23

From the result generated we can make out that, Rn with value 1 are the only unique records.

8. Helps to sort record and insert into view from the source

Suppose we want to insert some record into a View from a physical table after performing a sort operation as shown under

Create View TestView
As
Select Geo_Id, Description  
From dbo.Geo
Order By Description

It throws error

Msg 1033, Level 15, State 1, Procedure TestView, Line 5 The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.

We can eliminate this by using a Row_Number funciton at this juncture

IF  EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[dbo].[TestView]'))
DROP VIEW [dbo].[TestView]
GO

Create View TestView
As

Select Geo_Id, Description
From(
		Select 
		Rn = Row_Number() Over(Order By Description)
		,Geo_Id
		, Description  
		From dbo.Geo
	)x

GO
-- Project the records
Select *
From TestView

9. Helps to find appropriate record when use in conjunction with Aggregate function treated as Analytical function

Suppose we have a requirement where we need to calculate teh points earned by an individual in a game as display the Name,TotalPointsEarned and EmailAddress as shown under

Declare @t Table(Name varchar(20), PointsEarned int,EmailAddress varchar(50))
insert into @t values ('name1', 30, '1stEmail@email.com')
insert into @t values ('name1', 10, '2ndEmail@email.com')
insert into @t values ('name1', 50, '3rdEmail@email.com')
insert into @t values ('name2',60, '1stEmail@email.com')
insert into @t values ('name3',70, '1stEmail@email.com')
insert into @t values ('name3',10, '2ndEmail@email.com')
insert into @t values ('name4',34, '1stEmail@email.com')

Select 
	Name
	,TotalPointEarned = Sum(PointsEarned) Over(Partition By Name)
	,EmailAddress
From @t

/* Result */
Name	TotalPointEarned	EmailAddress
name1	90	1stEmail@email.com
name1	90	2ndEmail@email.com
name1	90	3rdEmail@email.com
name2	60	1stEmail@email.com
name3	80	1stEmail@email.com
name3	80	2ndEmail@email.com
name4	34	1stEmail@email.com

As can be figure out that though the expected result is obtained, but the records are been repeated.Inorder to get the desired output, we need the help of Row_Number function as under

;With Cte As
(
	Select 
		Rn = Row_Number() Over(Partition By Name Order By(Select 1))
		,Name
		,TotalPointEarned = Sum(PointsEarned) Over(Partition By Name)
		,EmailAddress
	From @t
)
Select 
	Name
	,TotalPointEarned
	,EmailAddress
From Cte
Where Rn = 1

/* Result */
Name	TotalPointEarned	EmailAddress
name1	90	1stEmail@email.com
name2	60	1stEmail@email.com
name3	80	1stEmail@email.com
name4	34	1stEmail@email.com

Inside the CTE we use the Row_Number function to get the records partitioned by Name where we are able to figure out the dulplicate entries(Rn > 1)

Rn	Name	TotalPointeEarned	EmailAddress
1	name1	90	1stEmail@email.com
2	name1	90	2ndEmail@email.com
3	name1	90	3rdEmail@email.com
1	name2	60	1stEmail@email.com
1	name3	80	1stEmail@email.com
2	name3	80	2ndEmail@email.com
1	name4	34	1stEmail@email.com

We just filteredd out those records outside the Cte

N.B.~ We have seen that, only the first Emails has come in the output and leaving the others.Interesting readers will find the solution to do the same.Leaving it as an exercise.Hint: Email should be comma separated.

10. Helps to find the last inserted row

Select LastInsertedRow = Max(Rn)
From(
		Select Rn = Row_Number()Over(Order By (Select 1))
		From @t
	)x

Other alternatives may be MAX,Count(*),Output.Inserted,Scope_Identity(), @@Identity etc.

References

ROW_NUMBER

Conclusion

So in this tutorial, we have seen atleast ten usages of Row_Number() function.Hope this will be helpful for many people.I am not claiming that there cannot be more applicability of this function.Astute readers please drop a line of comment of other usages for this function and I will include that in this list. Thanks for reading

Page copy protected against web site content infringement by Copyscape

About the Author

Niladri.Biswas
Full Name: Niladri Biswas
Member Level: Platinum
Member Status: Member
Member Since: 10/25/2010 11:04:24 AM
Country: India
Best Regards, Niladri Biswas
http://www.dotnetfunda.com
Technical Lead at HCL Technologies

Login to vote for this post.

Comments or Responses

Posted by: Ogipansrk on: 8/6/2012 | Points: 25
Good Work , Biswas!
Posted by: Deviprasads on: 8/11/2012 | Points: 25
Again a nice article from you!

Case 1: Using Co-Related Subquery and an Aggregate Function :
Declare @t Table(Name Varchar(20),Age int)

Insert Into @t
Select 'Name1', 20 Union All Select 'Name2', 21 Union All
Select 'Name3', 22 Union All Select 'Name4', 23 Union All
Select 'Name5', 24

Select
(
Select max(1) From @t t2
Where t2.Name <= t1.Name And t2.Age <= t1.Age) As Rn
,t1.Name
,t1.Age
From @t t1

This above query will return with RN=1 for all the rows in result table.

Login to post response

Comment using Facebook(Author doesn't get notification)