Change Tracking aka CT in SQL server 2008 R2

Vishvvas
Posted by in Sql Server category on for Intermediate level | Points: 250 | Views : 12164 red flag

This article explores one of the features (of SQL server 2008 and higher) of providing the change tracking abilities i.e. what data has changed.

Introduction and background


Change tracking which means tracking the changes in data is an important need of modern day applications and especially the enterprise systems. Change tracking addresses the needs to query the changes to data and the access (security) information associated with the data changes, precisely the nature of change and who did it. The popular mechanism of implementing is by custom way i.e. addition of following columns in each of the database table

1.     Added By :-   The information about the user who added the record

2.  AddedDateTime: The timestamp column for catching the exact datetime at which the record addition took place

3.   UpdatedBy: The information about the user who updated i.e. edited and deleted the record

4.   UpdatedDateTime: The timestamp column for catching the exact datetime at which the record updating took place

The other approached were creation of new table for storing the tracking information and using the combination of triggers and stored procedures or GUI component to put such information into these tables.

All these approaches have pros and cons but one thing is sure it requires a lot of work to be done and also it adds maintenance overhead.

The change tracking is characterized by

1.       Change in row(record) or not: Whether the row has changed and recording of such change (achieved simultaneously or in the same transaction)

2.       Which rows changed: which rows has changed and latest data of such changes or transaction rather than the intermediate values of multiple changes to same record/row

For recording the intermediate changes CDC i.e. change tracking feature is more useful. The information is available at http://www.dotnetfunda.com/articles/article1924-cdc-in-sql-server-2008-r2-part-i.aspx and http://www.dotnetfunda.com/articles/article1926-cdc-in-sql-server-2008-r2-part-ii.aspx.

The custom way of achieving such tracking involves lot of work and also it becomes maintenance overhead.

To address such issues, Microsoft has come up with more sophisticated and atomic approach called CT- Change Tracking. It provides ability to record all the changes caused be any DML statement that affects. This feature is particularly handy in one way or two way synchronization applications. Also this is built to work with ADO.NET synch services.


Objective

To learn about the CT feature of SQL server (version 2008 and higher) To learn about the CT feature of SQL server (version 2008 and higher) .


Description

Note: For the examples, the schema for Employee table from sample database i.e. AdvntureWorks in SQL 2008 R2 is used (after some changes in structure). For demonstration, the scripts are also attached.

Setting up CT has few administrative tasks to be performed on the database for who CT is to be implemented.

Let’s create a database called “TrialDB”.

USE master;

go

IF EXISTS (SELECT * FROM sys.databases WHERE name = 'TrialDB')

      DROP DATABASE TrialDB;

go

CREATE DATABASE TrialDB;

Go

Pre-requisites

1.       The compatibility level of the database shall be 90 or more. The database compatibility level can be sensed as following


If the level is lesser than 90, the function CHANGETABLE needed for querying the change tracking information would return the error although the change tracking can be configured.

1.       Enabling the SNAPSHOT isolation level helps to keep the change tracking information consistent.

--SETTING SNAPSHOT ISOALATION LEVELS

ALTER DATABASE TrialDB

    SET READ_COMMITTED_SNAPSHOT ON;

GO

ALTER DATABASE TrialDB

SET ALLOW_SNAPSHOT_ISOLATION ON

GO

This can be done through SQL Server Management Studio by using the Database Properties (ChangeTracking Page) dialog box.

The next step is to enable the CT for this database.

USE TrialDB;

GO

Alter Database TrialDB

Set Change_Tracking = ON

(Auto_CleanUP=ON, CHANGE_RETENTION=3 Days)

CHANGE_RETENTION is meant to specify the time period for which the tacking information be kept. The information would be removed periodically after such period. The AUTO_CLEANUP option is meant for either enabling or disabling the automatic clean-up of old tracking information

When READ_COMMITTED_SNAPSHOT  is enabled the read operations don’t block the update operations and all queries running under this isolation use row versioning.

Is it all done? Not yet, the CT is enabled for database but it has to be enabled for individual tables too.

Let’s create a table called “Employee”. This table originally belongs to database “AdventureWorks”.

SET ANSI_NULLS ON

GO

SET QUOTED_IDENTIFIER ON

GO

CREATE TABLE [Employee](

      [EmployeeID] [int] IDENTITY(1,1) NOT NULL,

      [NationalIDNumber] [nvarchar](15) NOT NULL,

      [ContactID] [int] NOT NULL,

      [LoginID] [nvarchar](256)  NULL,

      [ManagerID] [int] NULL,

      [Title] [nvarchar](50)  NULL,

      [BirthDate] [datetime]  NULL,

      [MaritalStatus] [nchar](1)  NULL,

      [Gender] [nchar](1)  NULL,

      [HireDate] [datetime]  NULL, 

      [VacationHours] [smallint]  NULL,

      [SickLeaveHours] [smallint]  NULL,   

      [rowguid] [uniqueidentifier] ROWGUIDCOL   NULL,

      [ModifiedDate] [datetime]  NULL,

 CONSTRAINT [PK_Employee_EmployeeID] PRIMARY KEY CLUSTERED (

      [EmployeeID] ASC

)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

) ON [PRIMARY]

GO

Once the table is created, the CT can be enabled as shown below.

--ENABLING THE CHANGE TRACKING ON EMPLOYEE TABLE

ALTER TABLE Employee

ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON)

GO

Once the change tracking is enabled, we can verify whether the change tracking is indeed enabled or not and also know the table name for which it is enabled.


The SQL server creates and internal table for tracking with nomenclature “change_tracking_<table_object_id>”. Such table can’t be queried directly but SQL server exposes a set of functions to help retrieve the information about the changes.

Let’s try to understand what this query means for us

CHANGE_TRACKING_CURRENT_VERSION: This function is devised to return the current version number at the database level, possibly the highest number for change information

CHANGE_TRACKING_MIN_VALID_VERSION: This function returns the minimum version after the information for a table change has been retained or lowest number for a table change information

CHANGETABLE: This function is used to retrieve change information after version 0. Since we are yet to  perform any DML operations after enabling Change Tracking, no records are returned.

Let’s perform some DML operations

INSERT:  Following query would insert a record into Employee table

INSERT INTO [TrialDB].[DBO].[Employee]

           ([NationalIDNumber]

           ,[ContactID]

           ,[LoginID]

           ,[ManagerID]

           ,[Title]

           ,[BirthDate]

           ,[MaritalStatus]

           ,[Gender]

           ,[HireDate]   

           ,[VacationHours]

           ,[SickLeaveHours]   

          )

     VALUES

           (14417810

           ,1212

           ,'trial\guy1'

           ,16

           ,'Production Technician - WC60'

           ,NULL

           ,'M'

           ,'M'

           ,GETDATE()          

           ,21

           ,30

           )

So this INSERT operation shall be captured and we can confirm this with following query.

SYS_CHANGE_OPERATION denotes the database operation

1: D: DELETE

2: I: INSERT and UPDATE

Let’s run the UPDATE query as following

UPDATE [TrialDB].[DBO].[Employee]

 SET [NationalIDNumber]= 14417815

 WHERE [NationalIDNumber]= 14417810

The change tracking details are

As seen, current version is 2 and change operation is “I” which is same for INSERT and UPDATE.

Retrieving all such DML changes is little tricky. It is as shown in the screenshot

 

It shows one record  ahd change tracking version as 2. So it is essentially showing the latest version as the for INSERT and UPDATE , it is treated as INSERT.

DELETE: We would delete the first record and see how it is captured.

DELETE FROM [TrialDB].[DBO].[Employee]

 WHERE [EmployeeID]=3

Following are the results for change information and changes.

This shows the latest change i.e. deletion and as the columns referred in select list as shown as NULL because it is deleted.

The last but not the least is to know which columns changed. This operation is also tricky in a sense that it needs a join between SQL server internal table and the table for which the changes are tracked.

DECLARE @PreviousVersion bigint = 0

SELECT    CTTable.EmployeeID,    CTTable.SYS_CHANGE_OPERATION,   Emp.NationalIDNumber,    Emp.ContactID,    Emp.Gender,

   [ContactIDChanged?] = CHANGE_TRACKING_IS_COLUMN_IN_MASK(COLUMNPROPERTY(OBJECT_ID('Employee'),

   'ContactID', 'ColumnId'), SYS_CHANGE_COLUMNS),   [NationalIDNumberChanged?] = CHANGE_TRACKING_IS_COLUMN_IN_MASK(COLUMNPROPERTY(

                          OBJECT_ID('Employee'),'NationalIDNumber', 'ColumnId')

                        , SYS_CHANGE_COLUMNS),

   [Gender?] = CHANGE_TRACKING_IS_COLUMN_IN_MASK(COLUMNPROPERTY(OBJECT_ID(

                          'Employee'),'Gender', 'ColumnId')

                        , SYS_CHANGE_COLUMNS)

 FROM CHANGETABLE (CHANGES Employee, @PreviousVersion) AS CTTable

  LEFT OUTER JOIN Employee AS Emp

 ON emp.EmployeeID = CTTable.EmployeeID

Following screenshot shows the query and results both.

 


The change information for column like ‘'NationalIDNumber'’, ‘'ContactID'’ and ‘'Gender'’ is asked for and of course after deletion these columns have changed.

Summary and Conclusion

The change tracking in SQL server is better alternative for custom solutions for tracking the database changes and also can be really useful in synchronization/ replication scenarios. The retrieval of change information is little tricky and needs some work of joining the internal tables and the user table for which the change tracking is enabled.

 We saw a practical example and also went through database operations and the capture of changes.

This looks like very promising feature in case where information about the last changes can matter and it would be easy to choose for at least new project. For existing systems, there is little one time overhead for setting up the CT but it would be worth vis-à-vis custom and cumbersome solutions.

HAPPY PROGRAMMING!!!

Reference

http://msdn.microsoft.com/en-us/library/bb933875.aspx

http://msdn.microsoft.com/en-us/library/bb964713.aspx

http://msdn.microsoft.com/en-us/library/bb933874.aspx

http://msdn.microsoft.com/en-us/library/bb933875.aspx

http://msdn.microsoft.com/en-us/library/bb964713.aspx

http://msdn.microsoft.com/en-us/library/bb933874.aspx

Page copy protected against web site content infringement by Copyscape

About the Author

Vishvvas
Full Name: Vishwas Sutar
Member Level: HonoraryPlatinum
Member Status: Member,MVP
Member Since: 5/30/2011 2:13:10 AM
Country: India

http://www.dotnetfunda.com
Extensive and rich experience across gamut of technologies and programming languages like PB,VB,C++,VB.NET, C#, Classic ASP,ASP.NET, ASP.NET MVC.

Login to vote for this post.

Comments or Responses

Login to post response

Comment using Facebook(Author doesn't get notification)